OM nucleic - nucleic search, using sw model

Run on:         August 9, 2005, 12:07:39 ; Search time 698 Seconds
                                           (without alignments)
                                           9244.295 Million cell updates/sec

Title:          US-10-653-681B-1
Perfect score:  1090
Sequence:       1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090

Scoring table:  OLIGO_NUC
                Gapop 60.0 , Gapext 60.0

Searched:       4390206 seqs, 2959870667 residues

Word size :     0

Total number of hits satisfying chosen parameters:      8780412

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Listing first 45 summaries

Database :      N_Geneseq_16Dec04:*
                1:  geneseqn1980s:*
                2:  geneseqn1990s:*
                3:  geneseqn2000s:*
                4:  geneseqn2001as:*
                5:  geneseqn2001bs:*
                6:  geneseqn2002as:*
                7:  geneseqn2002bs:*
                8:  geneseqn2003as:*
                9:  geneseqn2003bs:*
                10:  geneseqn2003cs:*
                11:  geneseqn2003ds:*
                12:  geneseqn2004as:*
                13:  geneseqn2004bs:*
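The throughput and hit-count figures in the header can be cross-checked from the other header values. The sketch below assumes that both strands of every database sequence are compared against the query (a factor of 2). That factor is an inference from the hit count being exactly twice the number of sequences searched, not something the report states. All constant and function names are illustrative.

```python
# Figures copied from the search header above.
QUERY_LENGTH = 1090            # query spans positions 1..1090
DB_SEQUENCES = 4_390_206       # "Searched: ... seqs"
DB_RESIDUES = 2_959_870_667    # "Searched: ... residues"
SEARCH_SECONDS = 698           # "Search time"
STRANDS = 2                    # assumption: both strands are searched


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=STRANDS):
    """Return alignment-matrix cell updates per second, in millions."""
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
hits = DB_SEQUENCES * STRANDS

print(f"{rate:.3f} Million cell updates/sec")  # header reports 9244.295
print(f"{hits} hits")                          # header reports 8780412
```

Under the same assumption, the GenEmbl run reported further down (4708233 seqs, 24227607955 residues, 5147 seconds) can be checked by passing its values to the same function.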



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



RESULT 9
AAF68405
ID   AAF68405 standard; cDNA; 1316 BP.
XX
AC   AAF68405;
XX
DT   12-APR-2001 (first entry)
XX
DE   Human lung tumour protein related nucleotide sequence SEQ ID NO: 323.
XX
KW   Human; lung cancer; lung tumour; lung tumour protein; gene therapy;
KW   lung cancer antigen; lung tumour-specific antigen; diagnosis; vaccine;
KW   cytostatic; antisense inhibition; ss.
XX
OS   Homo sapiens.
XX
PN   WO200100828-A2.
XX
PD   04-JAN-2001.
XX
PF   30-JUN-2000; 2000WO-US018061.
XX
PR   30-JUN-1999;   99US-00346492.
PR   15-OCT-1999;   99US-00419356.
PR   17-DEC-1999;   99US-00466867.
PR   30-DEC-1999;   99US-00476300.
PR   06-MAR-2000; 2000US-00519642.
PR   22-MAR-2000; 2000US-00533077.
PR   10-APR-2000; 2000US-00546259.
PR   27-APR-2000; 2000US-00560406.
PR   05-JUN-2000; 2000US-00589184.
XX
PA   (CORI-) CORIXA CORP.
XX
PI   Wang T, Bangur CS, Lodes MJ, Fanger GR, Vedvick TS, Carter D;
PI   Retter MW, Mannion J;
XX
DR   WPI; 2001-071488/08.
XX
PT   Lung tumor-associated proteins and the nucleic acids that encode them,
PT   useful for preventing, diagnosing and treating lung cancer.
XX
PS   Example 1; Page 249-250; 436pp; English.
XX
CC   The present invention describes immunogenic portions of lung tumour-
CC   associated proteins (I) and the nucleic acids (NAs) that encode them. (I)
CC   have cytostatic activity and can be used in gene therapy, antisense
CC   inhibition and in vaccines. The NAs and the lung tumour-associated
CC   proteins they encode may be used in the prevention, treatment and
CC   diagnosis of diseases associated with their inappropriate expression,
CC   especially lung cancers. For example, the NAs may be administered to
CC   treat diseases by rectifying mutations or deletions in a patient's genome
CC   that affect the activity of the protein by expressing inactive proteins
CC   or to supplement the patients own production of (I). Additionally, the
CC   NAs may be used to produce the lung-tumour associated protein, according
CC   to standard recombinant DNA methodology. Conversely, antisense NA
CC   molecules may be administered to down regulate protein expression by
CC   binding with the cells own genes and preventing their expression. The NA
CC   and complementary sequences may also be used as DNA probes in diagnostic
CC   assays to detect and quantitate the presence of similar NA sequences in
CC   samples, and hence which patients may be in need of treatment for lung
CC   cancer. The (I) may be used as antigens in the production of antibodies
CC   and in assays to identify modulators (agonists and antagonists) of the
CC   expression and activity of the protein. AAF68083 to AAF68878 and AAB76848
CC   to AAB76878 represent human lung tumour protein related nucleotide and
CC   protein sequences which are used in the exemplification of the present
CC   invention
XX
SQ   Sequence 1316 BP; 385 A; 299 C; 308 G; 324 T; 0 U; 0 Other;

  Query Match             60.9%;  Score 664;  DB 5;  Length 1316;
  Best Local Similarity   99.7%;  Pred. No. 0;
  Matches  764;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;

Qy        311 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 370
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        528 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 587

Qy        371 CCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACTGCCACTCCAA 430
              ||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||
Db        588 CCAGGTTGAGTGTCACCCATACCTCACACAGGAGAAACTGATCCAGTACTGCCACTCCAA 647

Qy        431 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 490
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        648 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 707

Qy        491 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 550
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        708 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 767

Qy        551 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 610
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        768 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 827

Qy        611 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 670
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        828 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 887

Qy        671 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 730
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        888 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 947

Qy        731 ATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAATCTCCTGGTG 790
              |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Db        948 ATCCTCTCATTTGGAAGACTATCCCTTCAATGCAGAATATTGAGGTTGAATCTCCTGGTG 1007

Qy        791 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 850
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1008 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 1067

Qy        851 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 910
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1068 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 1127

Qy        911 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 970
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1128 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 1187

Qy        971 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1030
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1188 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1247

Qy       1031 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAA 1076
              ||||||||||||||||||||||||||||||||||||||||||||||
Db       1248 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAA 1293
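The percentages reported for RESULT 9 are consistent with two simple ratios: Query Match as the score divided by the perfect score, and Best Local Similarity as the matches divided by the aligned positions. The sketch below only illustrates that arithmetic. The formulas are inferred from the numbers in this report rather than taken from GenCore documentation, and including indels in the denominator is an assumption that this report cannot test, since every alignment shown has zero indels.

```python
PERFECT_SCORE = 1090  # "Perfect score" from the search header


def query_match(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect score, to one decimal place."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, mismatches, indels=0):
    """Matches as a percentage of aligned positions (assumed denominator)."""
    aligned = matches + mismatches + indels
    return round(100.0 * matches / aligned, 1)


# RESULT 9 (AAF68405): Score 664; Matches 764; Mismatches 2; Indels 0
print(query_match(664))               # report shows 60.9%
print(best_local_similarity(764, 2))  # report shows 99.7%
```

The same two functions give 71.6 and 100.0 for the top-scoring hits (Score 780, Matches 780, Mismatches 0).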



RESULT 10
ABK38316
AC   ABK38316;
XX
DT   21-MAY-2002 (first entry)
XX
DE   cDNA encoding clone #18973 (L516S) of lung tumour protein.
XX
KW   Lung tumour; cancer; T cell; immune response stimulator; cytostatic;
KW   gene; ss.
XX
OS   Homo sapiens.
XX
PN   WO200204514-A2.
XX
PD   17-JAN-2002.
XX
PF   10-JUL-2001; 2001WO-US022058.
XX
PR   11-JUL-2000; 2000US-00614124.
PR   29-AUG-2000; 2000US-00651563.
PR   08-SEP-2000; 2000US-00658824.
PR   26-SEP-2000; 2000US-00671325.
PR   06-OCT-2000; 2000US-00677419.
PR   30-OCT-2000; 2000US-00702705.
PR   13-DEC-2000; 2000US-00736457.
PR   03-MAY-2001; 2001US-00849626.
XX
PA   (CORI-) CORIXA CORP.
XX
PI   Wang T, Watanabe Y, Henderson RA, Johnson JC, Retter MW;
PI   Marnerakis M, Carter D, Fanger GR, Vedvick TS, Bangur CS, Mcnabb A;
PI   Wang A, Fanger N, Switzer A, Mcneill PD, Clapper JD;
XX
DR   WPI; 2002-164634/21.
XX
PT   Novel polynucleotide encoding a lung tumor polypeptide useful for
PT   stimulating and/or expanding T cells specific for a tumor protein.
XX
PS   Example 1; SEQ ID NO 323; 223pp; English.
XX
CC   The invention describes an isolated polynucleotide and polypeptide useful
CC   for stimulating and/or expanding T cells specific for a tumour protein
CC   for determining the presence of a cancer in a patient. A composition
CC   containing the polynucleotide and/or polypeptide is useful for treating a
CC   lung cancer in a patient. The polypeptide is useful for removing tumour
CC   cells from a biological sample. The polynucleotide is also useful as
CC   probe or primer to detect the level of mRNA encoding a tumour protein.
CC   This sequence encodes a lung tumour associated protein or protein
CC   fragment, described in the method of the invention. Note: The sequence
CC   data for this patent did not form part of the printed specification, but
CC   was obtained in electronic format directly from WIPO at
CC   ftp.wipo.int/pub/published_pct_sequences
XX
SQ   Sequence 1316 BP; 385 A; 299 C; 308 G; 324 T; 0 U; 0 Other;

  Query Match             60.9%;  Score 664;  DB 6;  Length 1316;
  Best Local Similarity   99.7%;  Pred. No. 0;
  Matches  764;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;

Qy        311 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 370
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        528 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 587

Qy        371 CCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACTGCCACTCCAA 430
              ||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||
Db        588 CCAGGTTGAGTGTCACCCATACCTCACACAGGAGAAACTGATCCAGTACTGCCACTCCAA 647

Qy        431 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 490
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        648 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 707

Qy        491 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 550
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        708 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 767

Qy        551 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 610
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        768 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 827

Qy        611 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 670
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        828 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 887

Qy        671 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 730
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        888 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 947

Qy        731 ATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAATCTCCTGGTG 790
              |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Db        948 ATCCTCTCATTTGGAAGACTATCCCTTCAATGCAGAATATTGAGGTTGAATCTCCTGGTG 1007

Qy        791 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 850
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1008 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 1067

Qy        851 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 910
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1068 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 1127

Qy        911 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 970
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1128 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 1187

Qy        971 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1030
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1188 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1247

Qy       1031 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAA 1076
              ||||||||||||||||||||||||||||||||||||||||||||||
Db       1248 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAA 1293
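The two mismatches counted for this hit can be located by comparing the Qy and Db rows position by position. The sketch below is a minimal illustration that assumes an ungapped alignment (the report shows Indels 0 for this hit). The function name is hypothetical, and the two row pairs are the only ones in the alignment above where the sequences differ.

```python
def mismatch_positions(qy_start, qy_row, db_row):
    """Return query coordinates at which two aligned, ungapped rows differ."""
    if len(qy_row) != len(db_row):
        raise ValueError("rows must have equal length (no indels expected)")
    return [qy_start + i
            for i, (q, d) in enumerate(zip(qy_row, db_row))
            if q != d]


# (query start, Qy row, Db row) for the two rows that contain a mismatch.
rows = [
    (371,
     "CCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACTGCCACTCCAA",
     "CCAGGTTGAGTGTCACCCATACCTCACACAGGAGAAACTGATCCAGTACTGCCACTCCAA"),
    (731,
     "ATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAATCTCCTGGTG",
     "ATCCTCTCATTTGGAAGACTATCCCTTCAATGCAGAATATTGAGGTTGAATCTCCTGGTG"),
]

mismatches = [pos
              for start, qy, db in rows
              for pos in mismatch_positions(start, qy, db)]
print(mismatches)  # [398, 759]
```

Both differences are G in the query against A in the database sequence, at query positions 398 and 759.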
GenCore version 5.1.6
Copyright (c) 1993 - 2005 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on:         August 9, 2005, 11:49:49 ; Search time 5147 Seconds
                                           (without alignments)
                                           10261.548 Million cell updates/sec

Title:          US-10-653-681B-1
Perfect score:  1090
Sequence:       1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090

Scoring table:  OLIGO_NUC
                Gapop 60.0 , Gapext 60.0

Searched:       4708233 seqs, 24227607955 residues

Word size :     0

Total number of hits satisfying chosen parameters:      9416466

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Listing first 45 summaries

Database :      GenEmbl:*
                1:  gb_ba:*
                2:  gb_htg:*
                3:  gb_in:*
                4:  gb_om:*
                5:  gb_ov:*
                6:  gb_pat:*
                7:  gb_ph:*
                8:  gb_pl:*
                9:  gb_pr:*
                10:  gb_ro:*
                11:  gb_sts:*
                12:  gb_sy:*
                13:  gb_un:*
                14:  gb_vi:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



                                 SUMMARIES
                   %
Result           Query
   No.   Score   Match  Length  DB  ID           Description

     1     780    71.6    1336   6  CQ718316     CQ718316 Sequence
     2     780    71.6    1337   9  HSU37100     U37100 Homo sapien
     3     729    66.9    1551   9  BC008837     BC008837 Homo sapi
     4     729    66.9    1560   6  CQ776685     CQ776685 Sequence
     5     721    66.1    1611   9  AF524864     AF524864 Homo sapi
     6     664    60.9    1316   6  AR272611     AR272611 Sequence
     7     664    60.9    1316   6  AR276192     AR276192 Sequence
     8     664    60.9    1316   6  AR406467     AR406467 Sequence
     9     664    60.9    1316   6  AR440317     AR440317 Sequence
    10     664    60.9    1316   6  AR472475     AR472475 Sequence
    11     664    60.9    1316   6  AR543128     AR543128 Sequence
    12     664    60.9    1316   6  AX062696     AX062696 Sequence
    13     664    60.9    1316   6  AX367613     AX367613 Sequence
    14     664    60.9    1316   9  AF052577     AF052577 Homo sapi
    15     574    52.7     574   9  AF044961     AF044961 Homo sapi
    16     410    37.6     951   9  BT006794     BT006794 Homo sapi
    17     410    37.6     951  12  BT007750     BT007750 Synthetic
    18     409    37.5     948   9  CR541801     CR541801
    19     331    30.4  163631   9  AC009276     AC009276
c   20     331    30.4  170919   9  AC078847     AC078847
c   21     331    30.4  177373   2  AP002452     AP002452
    22     331    30.4  196039   2  AC055757     AC055757 Homo sapi
    23     328    30.1     364   6  AX247463     AX247463 Sequence
    24     316    29.0     585   6  AR176414     AR176414 Sequence
    25     316    29.0     585   6  BD226027     BD226027 Compound
    26     316    29.0     585   6  BD275698     BD275698 COMPOUNDS
    27     316    29.0     585   6  AR220483     AR220483 Sequence
    28     316    29.0     585   6  AR255477     AR255477 Sequence
    29     316    29.0     585   6  AR281047     AR281047 Sequence
    30     316    29.0     585   6  AR437838     AR437838 Sequence
    31     316    29.0     585   6  AR476374     AR476374 Sequence
    32     316    29.0     585   6  AR486565     AR486565 Sequence
    33     316    29.0     585   6  AR541068     AR541068 Sequence
    34     316    29.0     585   6  AX365699     AX365699 Sequence
c   35     316    29.0     858   6  AR176413     AR176413 Sequence
c   36     316    29.0     858   6  BD226026     BD226026 Compound
c   37     316    29.0     858   6  BD275697     BD275697 COMPOUNDS
c   38     316    29.0     858   6  AR220482     AR220482 Sequence
c   39     316    29.0     858   6  AR255476     AR255476 Sequence
c   40     316    29.0     858   6  AR281046     AR281046 Sequence
c   41     316    29.0     858   6  AR437837     AR437837 Sequence
c   42     316    29.0     858   6  AR476373     AR476373 Sequence
c   43     316    29.0     858   6  AR486564     AR486564 Sequence
c   44     316    29.0     858   6  AR541067     AR541067 Sequence
c   45     316    29.0     858   6  AX365698     AX365698 Sequence
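Each summary entry carries the fields named in the column header (No., Score, Match, Length, DB, ID, Description), optionally preceded by a lone "c". The sketch below shows one way such an entry could be read once its fields sit on a single whitespace-separated line. Treating "c" as a complement-strand flag is an assumption, since the report does not define it, and the `Hit` type and `parse_summary_row` function are hypothetical names.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    complement: bool    # True when the row starts with "c" (assumed meaning)
    rank: int           # "No." column
    score: float
    query_match: float  # "% Query Match" column
    length: int
    db: int             # index into the numbered database list
    hit_id: str
    description: str


def parse_summary_row(line):
    """Parse one whitespace-separated summary row into a Hit."""
    tokens = line.split()
    complement = tokens[0] == "c"
    if complement:
        tokens = tokens[1:]
    rank, score, match, length, db, hit_id = tokens[:6]
    return Hit(complement, int(rank), float(score), float(match),
               int(length), int(db), hit_id, " ".join(tokens[6:]))


hit = parse_summary_row("c 35 316 29.0 858 6 AR176413 AR176413 Sequence")
print(hit.rank, hit.hit_id, hit.complement)  # 35 AR176413 True
```

The `db` field can then be looked up in the numbered database list of the search header, where 6 corresponds to gb_pat for this run.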



ALIGNMENTS


RESULT 1
CQ718316
LOCUS       CQ718316                1336 bp    DNA     linear   PAT 03-FEB-2004
DEFINITION  Sequence 4250 from Patent WO02068579.
ACCESSION   CQ718316
VERSION     CQ718316.1  GI:42279173
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Venter,C.J., Adams,M.C., Li,P.W. and Myers,E.W.
  TITLE     Kits, such as nucleic acid arrays, comprising a majority of
            human exons or transcripts, for detecting expression and other uses
            thereof
  JOURNAL   Patent: WO 02068579-A 4250 06-SEP-2002;
            PE Corporation (NY) (US)
FEATURES             Location/Qualifiers
     source          1..1336
                     /organism="Homo sapiens"
                     /mol_type="unassigned DNA"
                     /db_xref="taxon:9606"
ORIGIN

  Query Match             71.6%;  Score 780;  DB 6;  Length 1336;
  Best Local Similarity   100.0%;  Pred. No. 0;
  Matches  780;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy        311 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 370
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        557 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 616

Qy        371 CCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACTGCCACTCCAA 430
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        617 CCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACTGCCACTCCAA 676

Qy        431 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 490
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        677 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 736

Qy        491 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 550
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        737 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 796

Qy        551 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 610
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        797 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 856

Qy        611 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 670
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        857 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 916

Qy        671 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 730
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        917 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 976

Qy        731 ATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAATCTCCTGGTG 790
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        977 ATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAATCTCCTGGTG 1036

Qy        791 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 850
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1037 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 1096

Qy        851 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 910
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1097 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 1156

Qy        911 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 970
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1157 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 1216

Qy        971 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1030
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1217 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1276

Qy       1031 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAATAATAATCAT 1090
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1277 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAATAATAATCAT 1336



us-10-653-681b-1.oligo.rng

GenCore version 5.1.6
Copyright (c) 1993 - 2005 Compugen Ltd.

OM nucleic - nucleic search, using sw model

Run on:         August 9, 2005, 12:07:39 ; Search time 698 Seconds
                                           (without alignments)
                                           9244.295 Million cell updates/sec

Title:          US-10-653-681B-1
Perfect score:  1090
Sequence:       1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090

Scoring table:  OLIGO_NUC
                Gapop 60.0 , Gapext 60.0

Searched:       4390206 seqs, 2959870667 residues

Word size :     0

Total number of hits satisfying chosen parameters:      8780412

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Listing first 45 summaries

Database :      N_Geneseq_16Dec04:*
                1:  geneseqn1980s:*
                2:  geneseqn1990s:*
                3:  geneseqn2000s:*
                4:  geneseqn2001as:*
                5:  geneseqn2001bs:*
                6:  geneseqn2002as:*
                7:  geneseqn2002bs:*
                8:  geneseqn2003as:*
                9:  geneseqn2003bs:*
                10:  geneseqn2003cs:*
                11:  geneseqn2003ds:*
                12:  geneseqn2004as:*
                13:  geneseqn2004bs:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



                                 SUMMARIES
                   %
Result           Query
   No.   Score   Match  Length  DB  ID           Description

     1     780    71.6    1337   5  AAS68608     Aas68608 DNA encod
     2     780    71.6    1337  10  ADD71032     Add71032 Human aid
     3     773    70.9    1549  12  ADK70274     Adk70274 Respirato
     4     729    66.9    1508   3  AAC98140     Aac98140 Human col
     5     729    66.9    1560  12  ADJ75119     Adj75119 Marker ge
     6     729    66.9    1560  12  ADN04246     Adn04246 Antipsori
     7     729    66.9    1560  13  ACN38728     Acn38728 Tumour-as
     8     729    66.9    1560  13  ADS85007     Ads85007 Human ato
     9     664    60.9    1316   5  AAF68405     Aaf68405 Human lun
    10     664    60.9    1316   6  ABK38316     Abk38316 cDNA enco
    11     664    60.9    1316   7  ADS73134     Ads73134 Human kid
    12     664    60.9    1316   8  ACA10645     Aca10645 Human lun
    13     664    60.9    1316   8  ABX99596     Abx99596 Lung canc
    14     664    60.9    1316  10  ADH45842     Adh45842 Human lun
    15     664    60.9    1316  12  ADE72379     Ade72379 Human lun
    16     664    60.9    1316  13  ADJ19761     Adj19761 Human lun
    17     529    48.5     770  13  ADR98739     Adr98739 Lung spec
    18     529    48.5    1621  12  ADH13722     Adh13722 Human ENZ
c   19     389    35.7     558  10  ABZ84625     Abz84625 Toxicolog
    20     328    30.1     364   4  AAS39335     Aas39335 Novel hum
    21     316    29.0     585   2  AAZ24592     Aaz24592 Human lun
    22     316    29.0     585   3  AAC65831     Aac65831 Human lun
    23     316    29.0     585   6  ABL49050     Abl49050 Human lun
    24     316    29.0     585   6  ABQ92236     Abq92236 Human lun
    25     316    29.0     585   9  ADA28651     Ada28651 Human lun
    26     316    29.0     585  10  ADE53611     Ade53611 Human lun
    27     316    29.0     585  10  ADH36746     Adh36746 Human lun
    28     316    29.0     585  12  ADM56549     Adm56549 Human lun
    29     316    29.0     585  12  ADN89593     Adn89593 Human lun
c   30     316    29.0     857   9  ADA28650     Ada28650 Human lun
c   31     316    29.0     858   2  AAZ24591     Aaz24591 Human lun
c   32     316    29.0     858   3  AAC65830     Aac65830 Human lun
c   33     316    29.0     858   6  ABL49049     Abl49049 Human lun
c   34     316    29.0     858   6  ABQ92235     Abq92235 Human lun
c   35     316    29.0     858  10  ADE53610     Ade53610 Human lun
c   36     316    29.0     858  10  ADH36745     Adh36745 Human lun
c   37     316    29.0     858  12  ADM56548     Adm56548 Human lun
c   38     316    29.0     858  12  ADN89592     Adn89592 Human lun
    39     304    27.9     356   4  AAS39333     Aas39333 Novel hum
    40     252    23.1    1396   5  AAS91091     Aas91091 DNA encod
    41     249    22.8     861   5  AAS68606     Aas68606 DNA encod
    42     235    21.6     830   4  AAI92428     Aai92428 Human pol
    43     183    16.8     540  12  ADP28822     Adp28822 Human sec
    44     174    16.0     857  13  ADR98738     Adr98738 Lung spec
c   45     159    14.6     198  12  ACH84956     Ach84956 Human gen



ALIGNMENTS


RESULT 1
AAS68608
ID   AAS68608 standard; cDNA; 1337 BP.
XX
AC   AAS68608;
XX
DT   13-FEB-2002 (first entry)
XX
DE   DNA encoding novel human diagnostic protein #4412.
XX
KW   Human; chromosome mapping; gene mapping; gene therapy; forensic;
KW   food supplement; medical imaging; diagnostic; genetic disorder; ss.
XX
OS   Homo sapiens.
XX
PN   WO200175067-A2.
XX
PD   11-OCT-2001.
XX
PF   30-MAR-2001; 2001WO-US008631.
XX
PR   31-MAR-2000; 2000US-00540217.
PR   23-AUG-2000; 2000US-00649167.
XX
PA   (HYSE-) HYSEQ INC.
XX
PI   Drmanac RT, Liu C, Tang YT;
XX
DR   WPI; 2001-639362/73.
DR   P-PSDB; ABG04421.
XX
PT   New isolated polynucleotide and encoded polypeptides, useful in
PT   diagnostics, forensics, gene mapping, identification of mutations
PT   responsible for genetic disorders or other traits and to assess
PT   biodiversity.
XX
PS   Claim 1; SEQ ID NO 4412; 103pp; English.
XX
CC   The invention relates to isolated polynucleotide (I) and polypeptide (II)
CC   sequences. (I) is useful as hybridisation probes, polymerase chain
CC   reaction (PCR) primers, oligomers, and for chromosome and gene mapping,
CC   and in recombinant production of (II). The polynucleotides are also used
CC   in diagnostics as expressed sequence tags for identifying expressed
CC   genes. (I) is useful in gene therapy techniques to restore normal
CC   activity of (II) or to treat disease states involving (II). (II) is
CC   useful for generating antibodies against it, detecting or quantitating a
CC   polypeptide in tissue, as molecular weight markers and as a food
CC   supplement. (II) and its binding partners are useful in medical imaging
CC   of sites expressing (II). (I) and (II) are useful for treating disorders
CC   involving aberrant protein expression or biological activity. The
CC   polypeptide and polynucleotide sequences have applications in
CC   diagnostics, forensics, gene mapping, identification of mutations
CC   responsible for genetic disorders or other traits to assess biodiversity
CC   and to produce other types of data and products dependent on DNA and
CC   amino acid sequences. AAS64197-AAS94564 represent novel human diagnostic
CC   coding sequences of the invention. Note: The sequence data for this
CC   patent did not appear in the printed specification, but was obtained in
CC   electronic format directly from WIPO at
CC   ftp.wipo.int/pub/published_pct_sequences
XX
SQ   Sequence 1337 BP; 390 A; 305 C; 318 G; 324 T; 0 U; 0 Other;

  Query Match             71.6%;  Score 780;  DB 5;  Length 1337;
  Best Local Similarity   100.0%;  Pred. No. 0;
  Matches  780;  Conservative    0;  Mismatches    0;  Indels    0;  Gaps    0;

Qy       311 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 370
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       558 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 617

Qy       371 CCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACTGCCACTCCAA 430
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       618 CCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACTGCCACTCCAA 677

Qy       431 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 490
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       678 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 737

Qy       491 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 550
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       738 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 797

Qy       551 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 610
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       798 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 857

Qy       611 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 670
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       858 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 917

Qy       671 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 730
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       918 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 977

Qy       731 ATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAATCTCCTGGTG 790
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       978 ATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAATCTCCTGGTG 1037

Qy       791 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 850
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1038 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 1097

Qy       851 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 910
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1098 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 1157

Qy       911 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 970
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1158 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 1217

Qy       971 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1030
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1218 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1277

Qy      1031 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAATAATAATCAT 1090
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1278 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAATAATAATCAT 1337
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GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         August 9, 2005, 11:53:34 ; Search time 4250 Seconds
                (without alignments)
                9762.366 Million cell updates/sec

Title:          US-10-653-681B-1
Perfect score:  1090
Sequence:       1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090

Scoring table:  OLIGO_NUC
                Gapop 60.0 , Gapext 60.0

Searched:       34239544 seqs, 19032134700 residues
Word size :     0

Total number of hits satisfying chosen parameters:      68479088

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Listing first 45 summaries
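The throughput and hit-count figures in this header can be cross-checked against one another. The numbers are consistent with each query base being compared against every database residue on both strands (query length × residues × 2, divided by the search time), and with the hit total being twice the number of sequences searched. The sketch below is only that consistency check; the function name and the two-strand factor are inferred from the printed values, not from GenCore documentation.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float, strands: int = 2) -> float:
    """Dynamic-programming cells filled per second, in millions.

    Assumption: both strands of every database sequence are searched,
    hence the default factor of 2.
    """
    return query_len * db_residues * strands / seconds / 1e6


# Values printed in the EST search header: 1090-base query,
# 19032134700 residues, 4250 seconds.
rate = million_cell_updates_per_sec(1090, 19032134700, 4250)
print(round(rate, 3))

# The hit total is twice the sequence count (one hit per strand).
print(2 * 34239544)
```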



Database :      EST:*
                1:  gb_est1:*
                2:  gb_est2:*
                3:  gb_htc:*
                4:  gb_est3:*
                5:  gb_est4:*
                6:  gb_est5:*
                7:  gb_est6:*
                8:  gb_gss1:*
                9:  gb_gss2:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES

                 %
Result          Query
  No.    Score  Match  Length  DB  ID         Description

c   1     629   57.7     735   5  BM981698   BM981698 UI-CF-EN1
    2     585   53.7     613   6  CB132708   CB132708 K-EST0183
c   3     570   52.3     593   5  BU677104   BU677104 UI-CF-DU1
    4     517   47.4     588   2  BE785963   BE785963 601478213
c   5     496   45.5     540   1  AA804597   AA804597 nk97e06.s
    6     493   45.2     623   4  BM793014   BM793014 K-EST0073
c   7     479   43.9     595   5  BM983180   BM983180 UI-CF-EN1
c   8     454   41.7     644   6  CA450136   CA450136 UI-CF-FN0



c 


q 


4 4 fi 


4 n 

4 U . 


q 


4 Rfi 


1 

X 


/\X J J / 


f\x ^yzjj / 


/ / cuz . x 


Q 


1 0 


4 4 fi 

" T U 


4 0 


g 


458 


1 


AT74 4 S04 

Al / *I 1 JUi 


AT744S04 

r\X 1 It Jul 


TAirr DQa HQ v 


Q 


1 1 


4 4 6 


4 0 


q 

Z/ 


4 60 
1 \j \j 


1 

x 


AT?Q1 4 fi? 
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qm / jfiUH . x 


Q 


1 9 

X <C 


4 4 4 


4 n 


7 


4 


1 

X 


AT ?Q?709 


ZiT ?Q^7 09 
/\x J7 J /Ui 


t- rf fi fi^ n 1 v 
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4 ?9 


?9 
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VJ 


4 4 fi 
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s 


nyi 04 ft 7 fi 


RV T 0 4 P 7 
DA 1 U fi 0 / O 


DV 1 H4 ft 7 fi 
DA X U fi 0 / D 
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4 ?1 

1 -J X 
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z> . 




sJ z. 


4 


RMfi 1 Q fifi? 


RMft 1 Q fi fi ? 


W— PCTfl fl fi 1 
rv CiJ i U U 0 / 
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39 . 


1 
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5 
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RMQ7 Rfifi4 


TTT — fP-FMl 
U X \*>C £*1N ± 
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3 
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1 j 0 


A 
1 
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DOX -7 / O / *i 
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D 17 19 9 
Kj IX / X Z Z 


c 


2 1 


?7R 


?4 


7 


?Rfi 
j 0 0 


1 
X 


ATft?1 SI Q 


ZiTft?1 Q 


wj ft ynl X . X 


c 
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J U J 
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O 


4 4 S 
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X 
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/\X JUlJiJ 


A T ? m ?9q 


qnz / e u y . X 




23 
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4 Qfi 

*1 U 


0 
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AW ? 7 Q ? 4 1 

■rVvv J / ^ J*i 1 


AW?7 Q ^41 


MD fl — WTO 9 4 
1 V 1KU HI UZ fi 




24 
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1 

X 
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j 0 *t 


0 


RF7ft7ft70 

OH) 1 KJ / fj t \J 


RF7 ft 7 ft 7 0 
DEj / 0 / 0 / u 


fiOl 47qftl 9 




25 
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33 . 


0 


7 96 
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Rc;fift ? l Q fi 




fi 09 fi 9 q SO? 

O UZ DZ jjUj 


Q 


26 
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31 . 


7 
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2 


BE711936 


RP7 1 1 Q ?fi 


DV9-HT0fiQ 




97 


?4 1 

O 1 X 


?1 

J X . 




4 ft S 


7 


PV??4 fi7ft 


r\/? ? 4 fi7 ft 


T T "3 — TTTH 1 1 
X X10 U I U X X 


c 


9 fi 


??R 
o o o 


?1 

«J X • 


n 

\j 


4 1 s 


1 

X 


AAQ4 7 S 1 4 


aziq47^i 4 

,rt/\i7 4 / Jl4 


OCJ3 JllU x . s 




9 9 


?9 Q 


?o 




7 Rfi 


■j 


ROP? 1 ?fi1 


RD9 9 1 ? ft 1 
D^Z ZijOI 






?o 


?94 


9Q 


7 


?fiS 
j 0 j 




RD?7 7 4 71 
Dy j r / *± r X 


RO?7 7 4 7 1 


T T 9 — T TMfl m 
X XiZ UiYlU U / 




? 1 

O X 


?1 9 
o x _? 


9Q 


-> 


-51 Q 
J X -7 


4 


RMfi S ^9 Q9 


RMft S R 9 Q9 


l^— T^QTO 1 "5 ft 




?9 


?1 fi 


99 

Z -7 . 


n 

U 


7 ? Q 




dv 4 0 n 0 c c 
DAI 0 U O DO 


DY4 ft n ^ fi 
DAfi OUjDj 


iJJ\r DO DvJ 




?? 


?1 fi 


9 Q 
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n 
U 


11 / / 


4 


Dl v iO JOl^l 


DrlO D01Z1 




c 


?4 


?1 4 

O X *i 


9ft 


Q 
O 


fififi 

ODD 


1 

X 


AT Q9 4 7 
/\X 3^4 1 D 3 




wnjoauz . x 




? S 


?oo 


97 




?00 

JUU 


1 

X 


AT ?7 0fifi4 

/\X ^ / UDO'l 


AT970fifi4 
/\X <£ / U D D 4 


qui? ucu fi .x 




?fi 


9QQ 

^ z) y 


97 


4 


S4 Q 
j 1 _? 




RD?1 S?ft9 


DA-31 C O O O 

D^ J x JZ 0 Z 


KIj 1 1 U U X 




?7 


OQQ 


91 


4 


QOQ 


4 


Rr;i fiQ?7 ft 

DwX vJ5 J / O 


DPI CQ 07 O 


fi09 ?9 0Q?7 
D UZ jZUr J / 




? fi 


9 Q fi 

Z O 


91 


0 

Z 


?1 fi 
oxo 




RT T 1 7 S 4 fi 9 
DU x / J40Z 


DTT1 TC/icn 
DU X / D fl 






? Q 


Z O 


01 
Z / . 


1 
± 


jIU 


A 
H 


DJYI / O O 1 0 O 


DM /OJlOO 


K-ho 1 UudX 


c  40     292   26.8     346   2  BE775022   BE775022 IL2-UM007
   41     285   26.1     582   5  BP278752   BP278752 BP278752
   42     280   25.7     581   5  BP263763   BP263763 BP263763
   43     272   25.0     912   5  BQ220848   BQ220848 AGENCOURT
   44     248   22.8     535   6  CB147729   CB147729 K-EST0203
   45     248   22.8     557   4  BG490449   BG490449 602519494


ALIGNMENTS 



RESULT 1 

BM981698/C 

LOCUS       BM981698                 735 bp    mRNA    linear   EST 21-FEB-2003
DEFINITION  UI-CF-EN1-adi-f-17-0-UI.s1 UI-CF-EN1 Homo sapiens cDNA clone
            UI-CF-EN1-adi-f-17-0-UI 3', mRNA sequence
ACCESSION   BM981698
VERSION     BM981698.1  GI:19604453
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 735)
  AUTHORS   Bonaldo,M.F., Lennon,G. and Soares,M.B.
  TITLE     Normalization and subtraction: two approaches to facilitate gene
            discovery
  JOURNAL   Genome Res. 6 (9), 791-806 (1996)
  MEDLINE   97044477
   PUBMED   8889548
COMMENT Contact: McCray, PB 

McCray Lab 
University of Iowa 

2024 University of Iowa Med Labs, Iowa City, IA 52242, USA 

Tel: 319 356 4866 

Fax: 319 356 7171 

Email: paul-mccray@uiowa.edu 

Tissue Procurement: Dr. M. J. Welsh, University of Iowa 
cDNA Library preparation: Dr. M. Bento Soares, University of Iowa 
cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa 
DNA Sequencing by: Dr. M. Bento Soares, University of Iowa 
Clone Distribution: Researchers may obtain clones from Research 

Genetics (www.resgen.com) or from Open Biosystems 

(www.openbiosystems.com) . 

The following repetitive elements were found in this cDNA 
sequence: 1-44, >POLY_A#Simple_repeat (matched compliment) 
Seq primer: M13 FORWARD 
POLYA=Yes . 

FEATURES Location/Qualifiers 
source 1. .735 

/organism="Homo sapiens" 
/mol_type="mRNA" 
/db_xref="taxon: 9606" 
/clone="UI-CF-EN1-adi-f-17-0-UI"
/tissue_type="Primary Lung Cystic Fibrosis Epithelial
Cells"
/dev_stage="Adult"
/lab_host="DH10B (Life Technologies) (T1 phage resistant)"
/clone_lib="UI-CF-EN1"
/note="Organ: Lung; Vector: pT7T3-Pac (Pharmacia) with a
modified polylinker; Site_1: EcoR I; Site_2: Not I;
UI-CF-EN1 is a normalized cDNA library containing the 
following tissue(s): Primary Lung Cystic Fibrosis 
Epithelial Cells. The library was constructed according to 
Bonaldo, Lennon and Soares, Genome Research, 6:791-806, 
1996. First strand cDNA synthesis was primed with an 
oligo-dT primer containing a Not I site. Double stranded 
cDNA was ligated to an EcoR I adaptor, digested with Not 
I, and cloned directionally into pT7T3-Pac vector. The 
oligonucleotide used to prime the synthesis of 
first-strand cDNA contains a library tag sequence that is 
located between the Not I site and the (dT)18 tail. The 
sequence tag for this library is CTGCTCAGGT. 
TAG_TISSUE=Human Lung Epithelial Cell Lines untreated LPS
6hr to LPS 24h
TAG_LIB=UI-CF-EN1
TAG_SEQ=CTGCTCAGGT"

ORIGIN 

Query Match 57.7%; Score 629; DB 5; Length 735; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 629; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
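This hit is flagged with "/C" (and with "c" in the summary table), and the Db coordinates in the alignment run downward (645 to 586, and so on to 17), which is what one would expect if the query matched the complementary strand of the database entry. The sketch below shows how such a descending coordinate pair maps back to the stored sequence; the helper names are hypothetical, and the interpretation of "/C" as a complement-strand match is an assumption based on the coordinates rather than on GenCore documentation.

```python
# Translation table for the four DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def complement_segment(seq: str, start: int, end: int) -> str:
    """Bases read on the complementary strand from 1-based position
    `start` down to `end` (start >= end), as in a descending Db line."""
    return reverse_complement(seq[end - 1:start])


# Toy example: positions 3..1 of "AAACCC" read on the opposite strand.
print(complement_segment("AAACCC", 3, 1))
```

A Db line labelled 645 ... 586 therefore covers 60 bases of the stored entry, read in reverse-complement orientation.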



Qy       462 GGCTCTCCGGATAGACCTTGGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAG 521
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       645 GGCTCTCCGGATAGACCTTGGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAG 586

Qy       522 ATTAAGGAGATTGCTGCAAAGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCAT 581
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       585 ATTAAGGAGATTGCTGCAAAGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCAT 526

Qy       582 ATCCAGAGGAATGTGATTGTCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAAC 641
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       525 ATCCAGAGGAATGTGATTGTCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAAC 466

Qy       642 ATTCAGGTCTTTGACTTTAAATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAAC 701
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       465 ATTCAGGTCTTTGACTTTAAATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAAC 406

Qy       702 AGAAACTGGAGGGCCTGTAACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCGAT 761
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       405 AGAAACTGGAGGGCCTGTAACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCGAT 346

Qy       762 GCAGAATATTGAGGTTGAATCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTG 821
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       345 GCAGAATATTGAGGTTGAATCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTG 286

Qy       822 AAGTGTGACTACCTCCACTCATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAA 881
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       285 AAGTGTGACTACCTCCACTCATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAA 226

Qy       882 CTTAGTCCTGTTATAGACGAGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTC 941
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       225 CTTAGTCCTGTTATAGACGAGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTC 166

Qy       942 AACTAGGATCAGAATATCACAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCC 1001
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       165 AACTAGGATCAGAATATCACAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCC 106

Qy      1002 ACTTATCTGATCAGAACAAATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATG 1061
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       105 ACTTATCTGATCAGAACAAATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATG 46

Qy      1062 TAAAGATCAATAAAAAAAATAATAATCAT 1090
             |||||||||||||||||||||||||||||
Db        45 TAAAGATCAATAAAAAAAATAATAATCAT 17



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         August 9, 2005, 06:02:48 ; Search time 5146 Seconds
                (without alignments)
                10263.542 Million cell updates/sec

Title:          US-10-653-681B-1
Perfect score:  1090
Sequence:       1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       4708233 seqs, 24227607955 residues

Total number of hits satisfying chosen parameters:      9416466

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :      GenEmbl:*
                1:  gb_ba:*
                2:  gb_htg:*
                3:  gb_in:*
                4:  gb_om:*
                5:  gb_ov:*
                6:  gb_pat:*
                7:  gb_ph:*
                8:  gb_pl:*
                9:  gb_pr:*
                10: gb_ro:*
                11: gb_sts:*
                12: gb_sy:*
                13: gb_un:*
                14: gb_vi:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES

                 %
Result          Query
  No.    Score  Match  Length  DB  ID         Description

    1     833   76.4    1337   9  HSU37100   U37100 Homo sapien
    2   830.8   76.2    1336   6  CQ718316   CQ718316 Sequence
    3   828.2   76.0    1551   9  BC008837
    4   828.2   76.0    1560   6  C0776685
    5   820.2   75.2    1611   9  AF524864   AF524864
    6   796.6   73.1    1316   6  AR272611   AR272611
    7   796.6   73.1    1316   6  AR276192   AR276192 Sequence
    8   796.6   73.1    1316   6  AR406467   AR406467
    9   796.6   73.1    1316   6  AR440317   AR440317
   10   796.6   73.1    1316   6  AR472475   AR472475
   11   796.6   73.1    1316   6  AR543128   AR543128
   12   796.6   73.1    1316   6  AX062696   AX062696
   13   796.6   73.1    1316   6  AX367613   AX367613 Sequence
   14   796.6   73.1    1316   9  AF052577   AF052577 Homo sapi
   15     718   65.9    1315   6  AX743782   AX743782 Sequence
   16   651.8   59.8    3994   9  AL669847   AL669847 Human DNA
   17   651.8   59.8  121210   9  AL607022   AL607022 Human DNA
   18     574   52.7     574   9  AF044961   AF044961 Homo sapi
   19     508   46.6    1080   6  AX772965   AX772965 Sequence
   20   459.4   42.1     951   9  BT006794   BT006794 Homo sapi
   21   459.4   42.1     951  12  BT007750   BT007750 Synthetic
   22   458.4   42.1     948   9  CR541801   CR541801 Homo sapi
   23     431   39.5     951      AX380448   AX380448 Sequence
   24     431   39.5     951      AX772962   AX772962 Sequence
   25   395.6   36.3     473   9  AY347931   AY347931 Macaca ra
   26   377.8   34.7  144234   2  AP002425   AP002425 Homo sapi
   27   377.8   34.7  144279   2  AP001570   AP001570 Homo sapi
   28   377.8   34.7  216972   9  AC067819   AC067819 Homo sapi
   29   374.8   34.4     585      CQ732993   CQ732993 Sequence
   30     361   33.1  137557   9  AC005909   AC005909 Homo sapi
   31   360.4   33.1     364   6  AX247463   AX247463 Sequence
   32   358.4   32.9  163631   9  AC009276   AC009276 Homo sapi
   33   358.4   32.9  170919   9  AC078847   AC078847 Homo sapi
   34   358.4   32.9  177373   2  AP002452   AP002452 Homo sapi
   35   358.4   32.9  196039   2  AC055757   AC055757 Homo sapi
   36   357.2   32.8          10  CGU81045   U81045 Cricetulus
   37   354.4   32.5    1400  10  BC037690   BC037690 Mus muscu
   38   353.4   32.4     356   6  AX247461   AX247461 Sequence
   39   353.4   32.4    1446  10  BC079133   BC079133 Rattus no
   40   344.6   31.6    1413  10  AF182168   AF182168 Rattus no
   41     332   30.5    1315  10  BC005789   BC005789 Mus muscu
   42   331.6   30.4    1304  10  MMU04204   U04204 Mus musculu
   43   327.2   30.0    1225   6  CQ777549   CQ777549 Sequence
   44   327.2   30.0    1225  10  MUSMVDP    J05663 Mouse vas d
   45     327   30.0     993  10  RNO277957  AJ277957 Rattus no



ALIGNMENTS 



RESULT 1 
HSU37100 

LOCUS       HSU37100                1337 bp    mRNA    linear   PRI 28-MAY-1998
DEFINITION  Homo sapiens aldose reductase-like peptide mRNA, complete cds.
ACCESSION   U37100
VERSION     U37100.1  GI:3150034
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1337)
  AUTHORS   Cao,D., Fan,S.T. and Chung,S.S.
  TITLE     Identification and characterization of a novel human aldose
            reductase-like gene
  JOURNAL   J. Biol. Chem. 273 (19), 11429-11435 (1998)
  MEDLINE   98234319
   PUBMED   9565553
REFERENCE   2  (bases 1 to 1337)
  AUTHORS   Cao,D.
  TITLE     Direct Submission
  JOURNAL   Submitted (27-SEP-1995) Deliang Cao, The University of Hong Kong,
            Institute of Molecular Biology, 8 Sassoon Road, Pokfulam, Hong
            Kong, Hong Kong
FEATURES             Location/Qualifiers
     source          1..1337
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="pCDL-l"
                     /tissue_type="liver tumor"
                     /dev_stage="adult"
     CDS             70..1020
                     /codon_start=1
                     /product="aldose reductase-like peptide"
                     /protein_id="AAC17469.1"
                     /db_xref="GI:3150035"
                     /translation="MATFVELSTKAKMPIVGLGTWKSPLGKVKEAVKVAIDAGYRHID
                     CAYVYQNEHEVGEAIQEKIQEKAVKREDLFIVSKLWPTFFERPLVRKAFEKTLKDLKL
                     SYLDVYLIHWPQGFKSGDDLFPKDDKGNAIGGKATFLDAWEAMEELVDEGLVKALGVS
                     NFSHFQIEKLLNKPGLKYKPVTNQVECHPYLTQEKLIQYCHSKGITVTAYSPLGSPDR
                     PWAKPEDPSLLEDPKIKEIAAKHKKTAAQVLIRFHIQRNVIVIPKSVTPARIVENIQV
                     FDFKLSDEEMATILSFNRNWRACNVLQSSHLEDYPFDAEY"
     polyA_site      1317..1322

ORIGIN 

Query Match 76.4%; Score 833; DB 9; Length 1337; 

Best Local Similarity 81.5%; Pred. No. 1.4e-221; 

Matches 1090; Conservative 0; Mismatches 0; Indels 247; Gaps 1; 
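The figures for this hit fit a simple affine gap model under the IDENTITY_NUC table with Gapop 10.0 and Gapext 1.0: 1090 matching bases, minus one gap costing 10 + 247 × 1, gives 833, and 1090 matches over 1090 + 247 aligned columns gives 81.5%. The sketch below works through that arithmetic. It assumes a match scores 1 and that a gap of length n costs Gapop + Gapext × n; both are inferred from the printed numbers rather than from GenCore documentation, and the function names are illustrative.

```python
def gapped_identity_score(matches: int, gap_lengths: list[int],
                          gap_open: float = 10.0,
                          gap_ext: float = 1.0) -> float:
    """Score of an alignment with no mismatches.

    Assumptions: each match scores 1; each gap of length n costs
    gap_open + gap_ext * n.
    """
    penalty = sum(gap_open + gap_ext * n for n in gap_lengths)
    return matches - penalty


def local_similarity_percent(matches: int, mismatches: int,
                             indels: int) -> float:
    """Matches as a percentage of all aligned columns, to one decimal."""
    return round(100.0 * matches / (matches + mismatches + indels), 1)


# RESULT 1 (HSU37100): 1090 matches and a single 247-base gap.
print(gapped_identity_score(1090, [247]))
print(local_similarity_percent(1090, 0, 247))
```

With the OLIGO_NUC runs (Gapop 60.0, Gapext 60.0) the same query instead yields shorter ungapped hits, since under this model opening any gap would cost far more than it could recover.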

Qy 1 CAAAAACAG CAACAGAAAGCAGGAC GT GAGACT T CT ACCT G CT CAC T CAGAAT CATTT CT 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 CAAAAACAGCAACAGAAAGCAGGAC GT GAGACTT CTACCT GCTCACT CAGAAT CATTTCT 60 

Qy 61 GCAC CAAC CAT G G C CAC GT TT GT G GAGCT CAGTAC CAAAGCCAAGAT G C C CATT GT G GG C 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 61 GCAC CAAC CAT GG C CAC GT T T GT GGAG CT CAGTAC CAAAG C CAAGAT G C C CAT T GT G GG C 120 



121 CTGGGCACTTGGAAGTCTCCTCTCGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I 
121 CTGGGCACTTGGAAGTCTCCTCTCGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 180 

181 GCAGGAT AT C GG CACATTGACT GT GC C TAT GT CTAT CAGAAT GAAC AT GAAGT GGG GGAA 240 

I I I I I I ! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

181 GCAGGAT AT C GGCACATTGACT GT GCCTAT GT CTAT CAGAAT GAACAT GAAGT GGGGGAA 240 

2 41 GCCAT C CAAGAGAAGAT CCAAGAGAAGGCT GT GAAGCGGGAGGACCT GTTCATCGT CAGC 300 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
241 GC CAT C CAAGAGAAGAT C CAAGAGAAGGCT GT GAAG C GG GAGGAC CT GT T C ATC GT CAGC 300 

301 AAGTTGTGGC 310 

MINIMI 

301 AAGT T GT G G C C CACT T T CTTT GAGAGACC C CT T GT GAGGAAAGC CT T T GAGAAGAC C CT C 360 

311 310 

361 AAGGACCTGAAGCTGAGCTATCTGGACGTCTATCTTATTCACTGGCCACAGGGATTCAAG 420 

311 310 

421 T CT G GG GAT GAC CT T TT C C C CAAAGAT GATAAAG GTAAT G C CAT C G GT G GAAAAG CAAC G 480 

311 310 

481 TTCTTGGATGCCTGGGAGGCCATGGAGGAGCTGGTGGATGAGGGGCTGGTGAAAGCCCTT 540 

311 C CACT T C CAGAT C GAGAAG CT C T T GAACAAAC CT GGACT GAAA 353 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
541 GGGGTCTC CAAT T T CAG C CAC T T C CAGAT C GAGAAG CTCT T GAACAAAC CT G GACT GAAA 600 

354 T ATAAAC CAGT GACTAAC CAG GT T GAGT GT CAC C CAT AC CT CAC G CAGGAGAAACT GAT C 413 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
601 TATAAACCAGTGACTAACCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATC 660 

414 CAGTACTGCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGAT 473 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
661 CAGTACTGCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGAT -720 

474 AGAC CT T G GG C CAAG C CAGAAGAC CCTTCCCTGCTG GAGGAT CC CAAGAT TAAGGAGAT T 533 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
721 AGAC CTT G G GC CAAGC CAGAAGAC CCTTCCCTGC T GGAGGAT C C CAAGAT TAAGGAGAT T 780 

534 GC T GCAAAG CACAAAAAAAC C GCAGC C CAG GT T C T GAT C C GT TT C CAT AT C CAGAG GAAT 593 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
781 GC T G CAAAG CACAAAAAAAC C G CAG C C CAGGT T CT GAT C C GT TT C CAT AT C CAGAGGAAT 84 0 

5 94 GT GAT T GT CAT C C C CAAGT CT GT GACAC CAG CAC GC ATT GTT GAGAACAT T CAG GT CTTT 653 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
841 GT GAT T GT CAT C C C CAAGT CT GT GACAC CAGCAC G CATT GT T GAGAACAT T CAGGT CTTT 900 



654 GACT T T AAAT T GAGT GAT GAGGAGAT GGCAAC CAT AC T CAG CTT CAACAGAAACT G GAG G 713 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I 
901 GAC T T T AAAT T GAGT GAT GAG G AGAT G G CAAC CAT AC T C AG C T T CAAC AGAAAC T G GAG G 960 



Qy 714 GC CT GTAAC GT GTT GCAAT C CT CT CAT TT GGAAGACTAT C C CT T C GAT G CAGAATATT GA 773 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 961 GC CT GTAAC GT GT T GCAAT C CT CT CAT TT GGAAGACTAT C C CT T C GAT G CAGAATATT GA 

1020 

Qy 774 GGTT GAAT CTCCTGGT GAGATT AT ACAGGAGAT TCTCTTTCTTCGCT GAAGT GT GACT AC 833 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

Db 1021 GGTTGAATCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTAC 

1080 

Qy 834 C T C CACT CAT GTC C CAT T TTAG C CAAGCTTATT TAAGAT C ACAGT GAACTTAGT CCT GTT 893 

II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 CT C CAC T CAT GT C C CATT TTAG C CAAGCT TAT T TAAGAT C ACAGT GAACTTAGT CCT GTT 
1140 

Qy 894 AT AGAC GAGAAT C GAGGT GCTGTT TTAGACATT T AT TT CT GT AT GTT CAACT AG GAT CAG 953 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 1141 AT AGAC GAGAAT C GAGGT GCT GTT T TAGACATT T AT T T CT GT AT GT T CAACT AG GAT CAG 

1200 

Qy 954 AAT AT CACAGAAAAGCAT GGCT T GAATAAGGAAAT GACAAT TT T T TC CACT TAT CT GAT C 

1013 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1201 AAT AT C AC AGAAAAG CAT G GCT T GAAT AAG GAAAT GACAAT T T T T T C CACT TAT CT GAT C 

1260 

Qy 1014 AGAACAAAT GTTTATTAAGCAT CAGAAACTCTGCCAACACT GAGGAT GTAAAGATCAATA 

1073 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 12 61 AGAACAAAT GT TT AT TAAGC AT CAGAAAC T CT G C CAAC ACT GAGGAT GT AAAGAT CAAT A 

1320 

Qy 1074 AAAAAAAT AAT AAT CAT 1090 

I I I I I I I I I I I I I I I I I 
Db 1321 AAAAAAAT AAT AAT CAT 1337 



us-10-653-681b-l.rng 



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         August 9, 2005, 05:31:49 ; Search time 697 seconds
                (without alignments)
                9257.558 Million cell updates/sec

Title:          US-10-653-681B-1
Perfect score:  1090
Sequence:       1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090

scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

searched:       4390206 seqs, 2959870667 residues

Total number of hits satisfying chosen parameters:      8780412

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries



Database :      N_Geneseq_16Dec04:*
                1:  geneseqn1980s:*
                2:  geneseqn1990s:*
                3:  geneseqn2000s:*
                4:  geneseqn2001as:*
                5:  geneseqn2001bs:*
                6:  geneseqn2002as:*
                7:  geneseqn2002bs:*
                8:  geneseqn2003as:*
                9:  geneseqn2003bs:*
                10: geneseqn2003cs:*
                11: geneseqn2003ds:*
                12: geneseqn2004as:*
                13: geneseqn2004bs:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES

                 %
Result          Query
  No.    Score  Match  Length  DB  ID         Description

    1     833   76.4    1337   5  AAS68608   Aas68608 DNA encod
    2     833   76.4    1337  10  ADD71032   Add71032 Human aid
    3   832.2   76.3    1508   3  AAC98140   Aac98140 Human col
    4   828.2   76.0    1560  12  ADJ75119   Adj75119 Marker ge
    5   828.2   76.0    1560  12  ADN04246   Adn04246 Antipsori
    6   828.2   76.0    1560  13  ACN38728   Acn38728 Tumour-as
    7   828.2   76.0    1560  13  ADS85007   Ads85007 Human ato
    8   822.8   75.5    1549  12  ADK70274   Adk70274 Respirato
    9   796.6   73.1    1316   5  AAF68405   Aaf68405 Human lun
   10   796.6   73.1    1316   6  ABK38316   Abk38316 cDNA enco
   11   796.6   73.1    1316   7  ADS73134   Ads73134 Human kid
   12   796.6   73.1    1316   8  ACA10645   Aca10645 Human lun
   13   796.6   73.1    1316   8  ABX99596   Abx99596 Lung canc
   14   796.6   73.1    1316  10  ADH45842   Adh45842 Human lun
   15   796.6   73.1    1316  12  ADE72379   Ade72379 Human lun
   16   796.6   73.1    1316  13  ADJ19761   Adj19761 Human lun
   17   770.8   70.7    1621  12  ADH13722   Adh13722 Human ENZ
   18     718   65.9    1315  10  ADC97771   Adc97771 Human ARL
   19   635.4   58.3    1816  11  ACN92921   Acn92921 Breast ca
   20     616   56.5     770  13  ADR98739   Adr98739 Lung spec
   21   540.2   49.6    1170  12  ADH45334   Adh45334 Human enz
   22     508   46.6    1080   9  ACC83986   Acc83986 Human aid
c  23   439.8   40.3     558  10  ABZ84625   Abz84625 Toxicolog
   24     432   39.6     971  10  ADC10183   Adc10183 Human NOV
   25     431   39.5     951   6  ABA94733   Aba94733 Human dru
   26     431   39.5     966  10  ADC10185   Adc10185 Human NOV
   27   360.4   33.1     364   4  AAS39335   Aas39335 Novel hum
   28   353.4   32.4     356   4  AAS39333   Aas39333 Novel hum
   29   333.8   30.6    1926   5  AAS72230   Aas72230 DNA encod
   30   333.8   30.6    1926   5  AAS92672   Aas92672 DNA encod
   31   333.8   30.6    3620   5  AAS69995   Aas69995 DNA encod
   32   327.2   30.0    1225  12  ADJ75983   Adj75983 Marker ge
   33     317   29.1     540  12  ADP28822   Adp28822 Human sec
   34     316   29.0     585   2  AAZ24592   Aaz24592 Human lun
   35     316   29.0     585   3  AAC65831   Aac65831 Human lun
   36     316   29.0     585   6  ABL49050   Abl49050 Human lun
   37     316   29.0     585   6  ABQ92236   Abq92236 Human lun
   38     316   29.0     585   9  ADA28651   Ada28651 Human lun
   39     316   29.0     585  10  ADE53611   Ade53611 Human lun
   40     316   29.0     585  10  ADH36746   Adh36746 Human lun
   41     316   29.0     585  12  ADM56549   Adm56549 Human lun
   42     316   29.0     585  12  ADN89593   Adn89593 Human lun
c  43     316   29.0     857   9  ADA28650   Ada28650 Human lun
c  44     316   29.0     858   2  AAZ24591   Aaz24591 Human lun
c  45     316   29.0     858   3  AAC65830   Aac65830 Human lun



ALIGNMENTS 



RESULT 1 
AAS68608 

ID AAS68608 standard; cDNA; 1337 BP. 
XX 

AC AAS68608; 
XX 

DT 13-FEB-2002 (first entry) 
XX 

DE DNA encoding novel human diagnostic protein #4412. 

XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder; ss. 

XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2. 
XX 

PD 11-OCT-2001.
XX 

PF 30-MAR-2001; 2001WO-US008631. 
XX 

PR 31-MAR-2000; 2000US-00540217. 

PR 23-AUG-2000; 2000US-00649167.


XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT;
XX 

DR WPI; 2001-639362/73. 

DR P-PSDB; ABG04421. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 
XX 

PS claim 1; SEQ ID NO 4412; 103pp; English. 

XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping,

CC and in recombinant production of (II). The polynucleotides are also used

CC in diagnostics as expressed sequence tags for identifying expressed

CC genes. (I) is useful in gene therapy techniques to restore normal

CC activity of (II) or to treat disease states involving (II). (II) is

CC useful for generating antibodies against it, detecting or quantitating a

CC polypeptide in tissue, as molecular weight markers and as a food

CC supplement. (II) and its binding partners are useful in medical imaging

CC of sites expressing (II). (I) and (II) are useful for treating disorders

CC involving aberrant protein expression or biological activity. The

CC polypeptide and polynucleotide sequences have applications in

CC diagnostics, forensics, gene mapping, identification of mutations

CC responsible for genetic disorders or other traits to assess biodiversity

CC and to produce other types of data and products dependent on DNA and

CC amino acid sequences. AAS64197-AAS94564 represent novel human diagnostic

CC coding sequences of the invention. Note: The sequence data for this

CC patent did not appear in the printed specification, but was obtained in

CC electronic format directly from WIPO at

CC ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 1337 BP; 390 A; 305 C; 318 G; 324 T; 0 U; 0 other;

Query Match 76.4%; Score 833; DB 5; Length 1337; 

Best Local Similarity 81.5%; Pred. No. 1.1e-230;

Matches 1090; Conservative 0; Mismatches 0; Indels 247; Gaps 1; 

Qy     1 CAAAAACAGCAACAGAAAGCAGGACGTGAGACTTCTACCTGCTCACTCAGAATCATTTCT 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 CAAAAACAGCAACAGAAAGCAGGACGTGAGACTTCTACCTGCTCACTCAGAATCATTTCT 60

Qy    61 GCACCAACCATGGCCACGTTTGTGGAGCTCAGTACCAAAGCCAAGATGCCCATTGTGGGC 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 GCACCAACCATGGCCACGTTTGTGGAGCTCAGTACCAAAGCCAAGATGCCCATTGTGGGC 120

Qy   121 CTGGGCACTTGGAAGTCTCCTCTCGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 CTGGGCACTTGGAAGTCTCCTCTCGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 180

Qy   181 GCAGGATATCGGCACATTGACTGTGCCTATGTCTATCAGAATGAACATGAAGTGGGGGAA 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 GCAGGATATCGGCACATTGACTGTGCCTATGTCTATCAGAATGAACATGAAGTGGGGGAA 240

Qy   241 GCCATCCAAGAGAAGATCCAAGAGAAGGCTGTGAAGCGGGAGGACCTGTTCATCGTCAGC 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 GCCATCCAAGAGAAGATCCAAGAGAAGGCTGTGAAGCGGGAGGACCTGTTCATCGTCAGC 300

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Qy   301 AAGTTGTGGC-------------------------------------------------- 310
         ||||||||||
Db   301 AAGTTGTGGCCCACTTTCTTTGAGAGACCCCTTGTGAGGAAAGCCTTTGAGAAGACCCTC 360

Qy   311 ------------------------------------------------------------ 310

Db   361 AAGGACCTGAAGCTGAGCTATCTGGACGTCTATCTTATTCACTGGCCACAGGGATTCAAG 420

Qy   311 ------------------------------------------------------------ 310

Db   421 TCTGGGGATGACCTTTTCCCCAAAGATGATAAAGGTAATGCCATCGGTGGAAAAGCAACG 480

Qy   311 ------------------------------------------------------------ 310

Db   481 TTCTTGGATGCCTGGGAGGCCATGGAGGAGCTGGTGGATGAGGGGCTGGTGAAAGCCCTT 540

Qy   311 -----------------CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAA 353
                          |||||||||||||||||||||||||||||||||||||||||||
Db   541 GGGGTCTCCAATTTCAGCCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAA 600

Qy   354 TATAAACCAGTGACTAACCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATC 413
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   601 TATAAACCAGTGACTAACCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATC 660

Qy   414 CAGTACTGCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGAT 473
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   661 CAGTACTGCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGAT 720

Qy   474 AGACCTTGGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATT 533
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   721 AGACCTTGGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATT 780

Qy   534 GCTGCAAAGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAAT 593
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   781 GCTGCAAAGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAAT 840

Qy   594 GTGATTGTCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTT 653
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   841 GTGATTGTCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTT 900

Qy   654 GACTTTAAATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGG 713
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   901 GACTTTAAATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGG 960

Qy   714 GCCTGTAACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGA 773
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   961 GCCTGTAACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGA 1020

Qy   774 GGTTGAATCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTAC 833
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1021 GGTTGAATCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTAC 1080

Qy   834 CTCCACTCATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTT 893
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1081 CTCCACTCATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTT 1140

Qy   894 ATAGACGAGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAG 953
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1141 ATAGACGAGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAG 1200

Qy   954 AATATCACAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATC 1013
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1201 AATATCACAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATC 1260
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us-10-653-681b-l.rng 

Qy  1014 AGAACAAATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATA 1073
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1261 AGAACAAATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATA 1320

Qy  1074 AAAAAAATAATAATCAT 1090
         |||||||||||||||||
Db  1321 AAAAAAATAATAATCAT 1337
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GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 

Run on: August 9, 2005, 07:25:59 ; Search time 4248 Seconds 

(without alignments) 
9766.962 Million cell updates/sec 

Title: US-10-653-681B-1 
Perfect score: 1090 

Sequence: 1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090 



Scoring table: IDENTITY_NUC
               Gapop 10.0 , Gapext 1.0

Searched: 34239544 seqs, 19032134700 residues

Total number of hits satisfying chosen parameters: 68479088

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database : EST:
               gb_est1:*
               gb_est2:*
               gb_htc:*
               gb_est3:*
               gb_est4:*
               gb_est5:*
               gb_est6:*
               gb_gss1:*
               gb_gss2:*


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES 



                   %
Result           Query
   No.   Score   Match  Length   DB  ID        Description

c    1   705.4   64.7     735    5  BM981698  BM981698 UI-CF-EN1
     2   685.2   62.9    1586    3  CR607509  CR607509 full-leng
c    3     666   61.1     746    2  BF688991  BF688991 602185236
c    4   652.2   59.8     796    5  BX337598  BX337598 BX337598
     5     647   59.4     909    4  BG169378  BG169378 602320937
c    6   586.6   53.8     666    1  AI924753  AI924753 wn58a02.x
     7     585   53.7     613    6  CB132708  CB132708 K-EST0183
c    8     573   52.6     595    5  BM983180  BM983180 UI-CF-EN1





Q 

7 


S70 ft 


S9 

J Z, • 


A 
H 


S93 
j 7 j 


J 


RU6771 f)4 

DUO/ /Xwt 




J. u 


S4 9 4 

Jit » 4 


4 Q 

4 -7 • 


Q 

O 


69 3 

UZ J 


A 


RM7 9 3D1 4 

CH 1 / 7 JU14 




1 1 

X X 


S96 9 


4 ft 


-5 
j 


969 


7 


roSRI 979 

V^VJJOX-7 /-7 




X z 


S99 
j z z 


4 7 

4 / • 


Q 

7 


i m 9 

X U X z 


j 


do Q4 36RD 

D\J 74 JOJU 


c 


1 3 


S9 1 6 


4 7 

4 / • 


q 
7 


S4 n 

J4u 


X 


Z\ Aft 04 ^97 

r\r\0 v 4 J 7 / 


<-« 
v_- 


1 4 


M ft 9 

-J X o • z. 


4 7 


j 


64 4 

U *i *i 




PA4 SOI 36 

^/A4 JUi. JO 




1 S 

X J 


SI 7 

J X f 


47 


A 


Sftft 


9 

Z, 


RF7ft S963 

D£j / O J 7 O J 




    16   506.8   46.5     881    7  CO580799




    17   502.6   46.1     704    7  CO582646




X u 


4 9R R 

*i J O ■ O 


45 . 


8 


620 




CRT 1 ft69S 

D i 1 U U J J 




X Z> 


4 ft 1 4 

4 O X • 4 


4 4 

4 4. 


9 
z 


63S 

O J J 


7 
i 


PV3 34 69 S 

v^VJJ*iOZJ 


c 


90 

Z. VJ 


4 7 S 4 
i / j • i 


4 3 
i j • 


U 


4 96 


9 


AW37934 1 

/AW J / 7 J 1 X 


c 


91 

Z X 


4 fin 


4 9 


9 


99 n 


9 
z. 


RF964 36ft 




z z 


4 S 6 4 

4 JO. 4 


4 1 
4 X . 


q 
7 


S99 

J 7Z 


4 


RMftl 9 6 63 

Dl v i 0 X 7 D D J 




9 3 
Z o 


4 4 Q d 
447,4 


4 1 
4 X . 


o 
z 


7ft 6 

/ 0 D 


c. 
J 


RH99 1 3 ft 1 
D^ZZ X J O X 


c 


z 4 


4 4 7 9 

4 4 / • Z 


4 1 

4 X • 


n 


4 S9 

4 J C, 


1 

X 


AT 3937D9 

/T.X J 7 J / U Z. 


c 


9 S 
z. 


4 4 6 4 

1 4 U i 4 


4 1 

41 ■ 


j 


456 


]_ 


AT999337 

Al Z, 7 Z. J J / 


i— 


96 

Z. yJ 


44 6 4 


41 . 


o 


458 


1 


AI7 4 4 504 




    27   446.4   41.0     460    1  AI291463




    28   446.4   41.0     582    5  BP278752


c 


9 9 

c. 7 


4 39 


39 

J 7 * 


5 


446 


5 


BX104 87 6 


<— 


30 

j u 


4 9 6 6 

4t U t u 


39 

J 7 . 


1 


4 4 5 
*i i j 


1 


AT301329 




    31   426.4   39.1     914    7  CO775128


c 


3 9 

J Z 


4 9 6 

4 Z O 


3Q 
j 7 • 


i 

X 


4 4ft 

4 4 O 


j 


RM97 S664 

DIM 7 / J U \J *1 




3 7 
J J 


4 17 

4 X / 


3 ft 

JO . 


J 


4 4 S 

4 4 J 


a 
\j 


PR1 6119 4 




3d 

J 4 


A fl 9 4 
4 U Z . 4 


36 


q 
7 


Sft 1 
J O X 


c. 

J 


RP9 637 63 

DtZDJ / DJ 


c 


3 S 
J J 


3Q9 6 
J 7 Z . O 


36 


n 

u 


4 1 S 

4 X J 


i 

X 


AA94 7 SI 4 
/\rt.7 4 / J X 4 




36 
J D 


3ft ft ft 


35. 


7 


4 S 3 
i j j 


A 
*4 


RH1 97 ft 7 4 

DOl 7 / O / *i 




77 
J / 


3 ft 4 ft 
o 0 4 . 0 


35. 


3 


919 
7 x z 


J 


R09 9 0 ft 4 ft 

O^Z Z U O 4 O 


c 


j 0 


3 ft 3 4 

J O J . 4 


35. 


2 


4 0 R 

4UJ 


a 
o 


P7 Sfl7 S 

/ J \J / J 


c 




J / 0 


34. 


7 


7 P P 
JOO 


X 


ATft31 SI Q 
/^.X 0 J X J X 7 




    40     373   34.2     384    2  BE787870
c   41   370.6   34.0     386    1  AI813308
c   42   365.2   33.5     388    2  BE711936
    43     360   33.0     796    4  BG682196
    44   351.2   32.2    1342    3  AK075865
    45     348   31.9    1236    3  AK019906



BU677104 UI-CF-DU1 
BM793014 K-EST0073 
CO581979 ILLUMIGEN
BQ943650 AGENCOURT
AA804597 nk97e06.s
CA450136 UI-CF-FN0
BE785963 601478213
CO580792 ILLUMIGEN
CO582646 ILLUMIGEN
CB118695 K-EST0165
CV334625 IL3-UT011
AW379341 MR0-HT024
BE964368 601658069
BM819663 K-EST0087
BQ221381 AGENCOURT
AI393702 tg66d01.x
AI292337 qm77c02.x
AI744504 wg09a09.x
AI291463 qm73h04.x
BP278752 BP278752
BX104876 BX104876
AI301329 qn27e09.x
CO775128 ILLUMIGEN
BM975664 UI-CF-EN1
CB161124 K-EST0220
BP263763 BP263763
AA947514 oq53h01.s
BG197874 RST17122
BQ220848 AGENCOURT
C75075 C75075 Huma
AI831519 wj49h11.x
BE787870 601479812 
AI813308 wj33c01.x 
BE711936 QV2-HT069 
BG682196 602629503 
AK075865 Mus muscu 
AK019906 Mus muscu 



us-10-653-681b-l.oligo. mi 



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd.



OM nucleic - nucleic search, using sw model 

Run on: August 9, 2005, 08:31:19 ; Search time 227 seconds 

(without alignments) 
7857.012 Million cell updates/sec 

Title: US-10-653-681B-1 
Perfect score: 1090 

Sequence: 1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090 

Scoring table: OLIGO_NUC

Gapop 60.0 , Gapext 60.0 

Searched: 1202784 seqs, 818138359 residues 

Word size : 0

Total number of hits satisfying chosen parameters: 2405568

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Listing first 45 summaries

Database : Issued_Patents_NA:*
               1: /cgn2_6/ptodata/1/ina/5A_COMB.seq:*
               2: /cgn2_6/ptodata/1/ina/5B_COMB.seq:*
               3: /cgn2_6/ptodata/1/ina/6A_COMB.seq:*
               4: /cgn2_6/ptodata/1/ina/6B_COMB.seq:*
               5: /cgn2_6/ptodata/1/ina/PCTUS_COMB.seq:*
               6: /cgn2_6/ptodata/1/ina/backfiles1.seq:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



SUMMARIES

                   %
Result           Query
   No.   Score   Match  Length   DB  ID                    Description

     1     729   66.9    1515    4  US-09-949-016-1344    Sequence 1344, Ap
     2     664   60.9    1316    4  US-09-702-705-323     Sequence 323, App
     3     664   60.9    1316    4  US-09-736-457-323     Sequence 323, App
     4     664   60.9    1316    4  US-09-614-124B-323    Sequence 323, App
     5     664   60.9    1316    4  US-09-671-325-323     Sequence 323, App
     6     664   60.9    1316    4  US-09-589-184-323     Sequence 323, App
     7     664   60.9    1316    4  US-09-658-824-323     Sequence 323, App
     8     331   30.4   17740    4  US-09-949-016-13086   Sequence 13086, A
     9     316   29.0     585    3  US-09-123-912-92      Sequence 92, Appl
    10     316   29.0     585    3  US-09-643-597-92      Sequence 92, Appl
    11     316   29.0     585    4  US-09-480-884A-92     Sequence 92, Appl
    12     316   29.0     585    4  US-09-542-615A-92     Sequence 92, Appl
    13     316   29.0     585    4  US-09-606-421B-92     Sequence 92, Appl
    14     316   29.0     585    4  US-09-221-107-92      Sequence 92, Appl
    15     316   29.0     585    4  US-09-466-396A-92     Sequence 92, Appl
    16     316   29.0     585    4  US-09-476-496A-92     Sequence 92, Appl
    17     316   29.0     585    4  US-09-630-940B-92     Sequence 92, Appl
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us- 


    18     316   29.0     585    4  US-09-285-479-92      Sequence 92, Appl
c   19     316   29.0     858    3  US-09-123-912-91      Sequence 91, Appl
c   20     316   29.0     858    3  US-09-643-597-91      Sequence 91, Appl
c   21     316   29.0     858    4  US-09-480-884A-91     Sequence 91, Appl
c   22     316   29.0     858    4  US-09-542-615A-91     Sequence 91, Appl
c   23     316   29.0     858    4  US-09-606-421B-91     Sequence 91, Appl
c   24     316   29.0     858    4  US-09-221-107-91      Sequence 91, Appl
c   25     316   29.0     858    4  US-09-466-396A-91     Sequence 91, Appl
c   26     316   29.0     858    4  US-09-476-496A-91     Sequence 91, Appl
c   27     316   29.0     858    4  US-09-630-940B-91     Sequence 91, Appl
c   28     316   29.0     858    4  US-09-285-479-91      Sequence 91, Appl
    29     300   27.5     601    4  US-09-949-016-46452   Sequence 46452, A
    30     261   23.9     601    4  US-09-949-016-46451   Sequence 46451, A
    31     223   20.5     914    4  US-09-949-016-3127    Sequence 3127, Ap
    32     159   14.6   15141    4  US-09-949-016-14869   Sequence 14869, A
    33     156   14.3     601    4  US-09-949-016-46442   Sequence 46442, A
    34     156   14.3     601    4  US-09-949-016-113475  Sequence 113475,
    35     144   13.2     233    4  US-09-702-705-31      Sequence 31, Appl
    36     144   13.2     233    4  US-09-736-457-31      Sequence 31, Appl
    37     144   13.2     233    4  US-09-614-124B-31     Sequence 31, Appl
    38     144   13.2     233    4  US-09-671-325-31      Sequence 31, Appl
    39     144   13.2     233    4  US-09-589-184-31      Sequence 31, Appl
    40     144   13.2     233    4  US-09-658-824-31      Sequence 31, Appl
    41     118   10.8     601    4  US-09-949-016-46437   Sequence 46437, A
    42     118   10.8     601    4  US-09-949-016-113470  Sequence 113470,
    43      26    2.4    1337    3  US-08-801-344-3       Sequence 3, Appli
    44      26    2.4    1337    3  US-09-498-599-3       Sequence 3, Appli
    45      22    2.0     292    4  US-09-313-294A-6562   Sequence 6562, Ap



ALIGNMENTS 



RESULT 1 

US-09-949-016-1344 

; Sequence 1344, Application US/09949016 
; Patent No. 6812339 
; GENERAL INFORMATION:

; APPLICANT: Venter, J. Craig et al.

; TITLE OF INVENTION: POLYMORPHISMS IN KNOWN GENES ASSOCIATED 

; TITLE OF INVENTION: WITH HUMAN DISEASE, METHODS OF DETECTION AND USES THEREOF 

; FILE REFERENCE: CL001307 

; CURRENT APPLICATION NUMBER: US/09/949,016

; CURRENT FILING DATE: 2000-04-14 

; PRIOR APPLICATION NUMBER: 60/241,755 

; PRIOR FILING DATE: 2000-10-20 

; PRIOR APPLICATION NUMBER: 60/237,768 

; PRIOR FILING DATE: 2000-10-03 

; PRIOR APPLICATION NUMBER: 60/231,498 

; PRIOR FILING DATE: 2000-09-08 

; NUMBER OF SEQ ID NOS: 207012 

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 1344 

; LENGTH: 1515 

; TYPE: DNA 

; ORGANISM: Human 

US-09-949-016-1344 

Query Match 66.9%; Score 729; DB 4; Length 1515; 

Best Local Similarity 99.9%; Pred. No. 0;

Matches 779; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   311 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 370
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us-iu-obi-oolb-l.onqo. rm 




         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   736 CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAACCAGTGACTAA 795

Qy   371 CCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACTGCCACTCCAA 430
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   796 CCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACTGCCACTCCAA 855

Qy   431 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 490
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   856 GGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTTGGGCCAAGCC 915

Qy   491 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 550
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   916 AGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAAAGCACAAAAA 975

Qy   551 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 610
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   976 AACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTGTCATCCCCAA 1035

Qy   611 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 670
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1036 GTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTAAATTGAGTGA 1095

Qy   671 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 730
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1096 TGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTAACGTGTTGCA 1155

Qy   731 ATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAATCTCCTGGTG 790
         |||||||||||||||||||||||||||| |||||||||||||||||||||||||||||||
Db  1156 ATCCTCTCATTTGGAAGACTATCCCTTCAATGCAGAATATTGAGGTTGAATCTCCTGGTG 1215

Qy   791 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 850
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1216 AGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACTCATGTCCCAT 1275

Qy   851 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 910
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1276 TTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACGAGAATCGAGG 1335

Qy   911 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 970
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1336 TGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCACAGAAAAGCA 1395

Qy   971 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1030
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1396 TGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAAATGTTTATTA 1455

Qy  1031 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAATAATAATCAT 1090
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1456 AGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAATAATAATCAT 1515
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us-10-653-681b-l.Oligo.rnpb 

GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on: August 9, 2005, 08:43:09 ; Search time 839 Seconds
        (without alignments)
        8421.615 Million cell updates/sec

Title: US-10-653-681B-1
Perfect score: 1090
Sequence: 1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090

Scoring table: OLIGO_NUC
               Gapop 60.0 , Gapext 60.0

Searched: 7297361 seqs, 3241162794 residues
Word size : 0

Total number of hits satisfying chosen parameters: 14594722

Minimum DB seq length:
Maximum DB seq length: 2000000000

Post-processing: Listing first 45 summaries

Database : Published_Applications_NA:*
               1: /cgn2_6/ptodata/2/pubpna/US07_PUBCOMB.seq:*
               2: /cgn2_6/ptodata/2/pubpna/PCT_NEW_PUB.seq:*
               3: /cgn2_6/ptodata/2/pubpna/US06_NEW_PUB.seq:*
               4: /cgn2_6/ptodata/2/pubpna/US06_PUBCOMB.seq:*
               5: /cgn2_6/ptodata/2/pubpna/US07_NEW_PUB.seq:*
               6: /cgn2_6/ptodata/2/pubpna/PCTUS_PUBCOMB.seq:*
               7: /cgn2_6/ptodata/2/pubpna/US08_NEW_PUB.seq:*
               8: /cgn2_6/ptodata/2/pubpna/US08_PUBCOMB.seq:*
               9: /cgn2_6/ptodata/2/pubpna/US09A_PUBCOMB.seq:*
               10: /cgn2_6/ptodata/2/pubpna/US09B_PUBCOMB.seq:*
               11: /cgn2_6/ptodata/2/pubpna/US09C_PUBCOMB.seq:*
               12: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq:*
               13: /cgn2_6/ptodata/2/pubpna/US10A_PUBCOMB.seq:*
               14: /cgn2_6/ptodata/2/pubpna/US10B_PUBCOMB.seq:*
               15: /cgn2_6/ptodata/2/pubpna/US10C_PUBCOMB.seq:*
               16: /cgn2_6/ptodata/2/pubpna/US10D_PUBCOMB.seq:*
               17: /cgn2_6/ptodata/2/pubpna/US10E_PUBCOMB.seq:*
               18: /cgn2_6/ptodata/2/pubpna/US10F_PUBCOMB.seq:*
               19: /cgn2_6/ptodata/2/pubpna/US10G_PUBCOMB.seq:*
               20: /cgn2_6/ptodata/2/pubpna/US10H_PUBCOMB.seq:*
               21: /cgn2_6/ptodata/2/pubpna/US10I_PUBCOMB.seq:*
               22: /cgn2_6/ptodata/2/pubpna/US10_NEW_PUB.seq:*
               23: /cgn2_6/ptodata/2/pubpna/US11A_PUBCOMB.seq:*
               24: /cgn2_6/ptodata/2/pubpna/US11_NEW_PUB.seq:*
               25: /cgn2_6/ptodata/2/pubpna/US60_NEW_PUB.seq:*
               26: /cgn2_6/ptodata/2/pubpna/US60_PUBCOMB.seq:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



% 
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us-10-653-681b-l.oligo.rnpb 



Result           Query
   No.   Score   Match  Length   DB  ID                    Description

     1    1090  100.0    1090   21  US-10-653-681A-1      Sequence 1, Appli
     2     729   66.9    1508    9  US-09-925-299-150     Sequence 150, App
     3     729   66.9    1508   10  US-09-925-299-150     Sequence 150, App
     4     664   60.9    1316    9  US-09-736-457-323     Sequence 323, App
     5     664   60.9    1316    9  US-09-902-941-323     Sequence 323, App
     6     664   60.9    1316    9  US-09-849-626-323     Sequence 323, App
     7     664   60.9    1316   10  US-09-476-300-323     Sequence 323, App
     8     664   60.9    1316   14  US-10-017-754-323     Sequence 323, App
     9     664   60.9    1316   15  US-10-102-524-1731    Sequence 1731, Ap
    10     664   60.9    1316   16  US-10-113-872-323     Sequence 323, App
    11     664   60.9    1316   17  US-10-283-017-323     Sequence 323, App
    12     487   44.7    1279   21  US-10-653-681A-3      Sequence 3, Appli
    13     328   30.1     364   10  US-09-803-719-2393    Sequence 2393, Ap
    14     316   29.0     585    9  US-09-735-705-92      Sequence 92, Appl
    15     316   29.0     585    9  US-09-850-716A-92     Sequence 92, Appl
    16     316   29.0     585    9  US-09-897-778-92      Sequence 92, Appl
    17     316   29.0     585   10  US-09-466-396A-92     Sequence 92, Appl
    18     316   29.0     585   14  US-10-007-700-92      Sequence 92, Appl
    19     316   29.0     585   15  US-10-117-982-92      Sequence 92, Appl
    20     316   29.0     585   17  US-10-313-986-92      Sequence 92, Appl
    21     316   29.0     585   20  US-10-775-972-92      Sequence 92, Appl
    22     316   29.0     585   22  US-10-922-124-92      Sequence 92, Appl
c   23     316   29.0     858    9  US-09-735-705-91      Sequence 91, Appl
c   24     316   29.0     858    9  US-09-850-716A-91     Sequence 91, Appl
c   25     316   29.0     858    9  US-09-897-778-91      Sequence 91, Appl
c   26     316   29.0     858   10  US-09-466-396A-91     Sequence 91, Appl
c   27     316   29.0     858   14  US-10-007-700-91      Sequence 91, Appl
c   28     316   29.0     858   15  US-10-117-982-91      Sequence 91, Appl
c   29     316   29.0     858   17  US-10-313-986-91      Sequence 91, Appl
c   30     316   29.0     858   20  US-10-775-972-91      Sequence 91, Appl
c   31     316   29.0     858   22  US-10-922-124-91      Sequence 91, Appl
    32     304   27.9     356   10  US-09-803-719-2391    Sequence 2391, Ap
c   33     159   14.6     198   16  US-10-029-386-18151   Sequence 18151, A
c   34     159   14.6     546   16  US-10-029-386-4451    Sequence 4451, Ap
    35     159   14.6     951   15  US-10-274-694-36      Sequence 36, Appl
    36     159   14.6     951   20  US-10-332-448-36      Sequence 36, Appl
    37     159   14.6    1315   15  US-10-274-375-1       Sequence 1, Appli
    38     144   13.2     233    9  US-09-736-457-31      Sequence 31, Appl
    39     144   13.2     233    9  US-09-902-941-31      Sequence 31, Appl
    40     144   13.2     233    9  US-09-849-626-31      Sequence 31, Appl
    41     144   13.2     233   10  US-09-476-300-31      Sequence 31, Appl
    42     144   13.2     233   14  US-10-017-754-31      Sequence 31, Appl
    43     144   13.2     233   16  US-10-113-872-31      Sequence 31, Appl
    44     144   13.2     233   17  US-10-283-017-31      Sequence 31, Appl
    45     116   10.6    1816   14  US-10-198-846-14071   Sequence 14071, A



ALIGNMENTS 



RESULT 1 

US-10-653-681A-1 

; Sequence 1, Application US/10653681A 

; Publication No. US20050048503A1 

; GENERAL INFORMATION: 

; APPLICANT: DAI, KEN-SHWO 

; TITLE OF INVENTION: HUMAN ARL-RELATED GENE VARIANTS ASSOCIATED WITH CANCER 
; FILE REFERENCE: U014798-3 

; CURRENT APPLICATION NUMBER: US/10/653,681A
; CURRENT FILING DATE: 2003-09-02 
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us-10-653-681b-l.Oligo.rnpb 

; NUMBER OF SEQ ID NOS: 4

; SOFTWARE: PatentIn version 3.2

; SEQ ID NO 1 

; LENGTH: 1090 

; TYPE: DNA 

; ORGANISM: ARTIFICIAL SEQUENCE 
; FEATURE: 

; OTHER INFORMATION: VARIANT OF HUMAN ALDOSE REDUCTASE-LIKE GENE 

; FEATURE: 

; NAME/KEY: CDS 

; LOCATION: (70)..(333)

US-10-653-681A-1 



Query Match 100.0%; Score 1090; DB 21; Length 1090; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1090; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy     1 CAAAAACAGCAACAGAAAGCAGGACGTGAGACTTCTACCTGCTCACTCAGAATCATTTCT 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 CAAAAACAGCAACAGAAAGCAGGACGTGAGACTTCTACCTGCTCACTCAGAATCATTTCT 60

Qy    61 GCACCAACCATGGCCACGTTTGTGGAGCTCAGTACCAAAGCCAAGATGCCCATTGTGGGC 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 GCACCAACCATGGCCACGTTTGTGGAGCTCAGTACCAAAGCCAAGATGCCCATTGTGGGC 120

Qy   121 CTGGGCACTTGGAAGTCTCCTCTCGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 CTGGGCACTTGGAAGTCTCCTCTCGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 180

Qy   181 GCAGGATATCGGCACATTGACTGTGCCTATGTCTATCAGAATGAACATGAAGTGGGGGAA 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 GCAGGATATCGGCACATTGACTGTGCCTATGTCTATCAGAATGAACATGAAGTGGGGGAA 240

Qy   241 GCCATCCAAGAGAAGATCCAAGAGAAGGCTGTGAAGCGGGAGGACCTGTTCATCGTCAGC 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 GCCATCCAAGAGAAGATCCAAGAGAAGGCTGTGAAGCGGGAGGACCTGTTCATCGTCAGC 300

Qy   301 AAGTTGTGGCCCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAAC 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 AAGTTGTGGCCCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAAC 360

Qy   361 CAGTGACTAACCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACT 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   361 CAGTGACTAACCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACT 420

Qy   421 GCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTT 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   421 GCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTT 480

Qy   481 GGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAA 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   481 GGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAA 540

Qy   541 AGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTG 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   541 AGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTG 600

Qy   601 TCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTA 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   601 TCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTA 660

Qy   661 AATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTA 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   661 AATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTA 720

Qy   721 ACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAA 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   721 ACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAA 780

Qy   781 TCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACT 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   781 TCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACT 840

Qy   841 CATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACG 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   841 CATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACG 900

Qy   901 AGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCA 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   901 AGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCA 960

Qy   961 CAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAA 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   961 CAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAA 1020

Qy  1021 ATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAA 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1021 ATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAA 1080

Qy  1081 TAATAATCAT 1090
         ||||||||||
Db  1081 TAATAATCAT 1090
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us-10-653-681b-l.rni 



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         August 9, 2005, 00:34:58 ; Search time 226 Seconds
                (without alignments)
                7891.777 Million cell updates/sec

Title:          US-10-653-681B-1
Perfect score:  1090
Sequence:       1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       1202784 seqs, 818138359 residues

Total number of hits satisfying chosen parameters:  2405568

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      Issued_Patents_NA:*

                1: /cgn2_6/ptodata/1/ina/5A_COMB.seq:*
                2: /cgn2_6/ptodata/1/ina/5B_COMB.seq:*
                3: /cgn2_6/ptodata/1/ina/6A_COMB.seq:*
                4: /cgn2_6/ptodata/1/ina/6B_COMB.seq:*
                5: /cgn2_6/ptodata/1/ina/PCTUS_COMB.seq:*
                6: /cgn2_6/ptodata/1/ina/backfiles1.seq:*
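
The throughput figure in the header can be cross-checked against the other header fields. The following is a minimal sketch, not part of the search report. It assumes that one "cell update" is one query-residue by database-residue comparison and that both strands of each database sequence are searched; both points are inferred from the reported numbers (the hit count is twice the number of sequences searched), not stated by the report. The function name is illustrative.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Estimate alignment throughput in millions of cell updates per second.

    Assumption: every query residue is compared against every database
    residue once per strand searched (strands=2 means both orientations).
    """
    total_cells = query_len * db_residues * strands
    return total_cells / seconds / 1e6


# Values taken from the header above: a 1090-residue query,
# 818138359 database residues, 226 seconds of search time.
rate = million_cell_updates_per_sec(1090, 818_138_359, 226)
print(f"{rate:.3f} Million cell updates/sec")
```

Under these assumptions the result agrees with the reported 7891.777 to three decimal places, which suggests the header fields are internally consistent.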

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES

Result       Query
  No.  Score Match Length DB  ID                    Description
    1  828.2  76.0   1515  4  US-09-949-016-1344    Sequence 1344, Ap
    2  796.6  73.1   1316  4  US-09-702-705-323     Sequence 323, App
    3  796.6  73.1   1316  4  US-09-736-457-323     Sequence 323, App
    4  796.6  73.1   1316  4  US-09-614-124B-323    Sequence 323, App
    5  796.6  73.1   1316  4  US-09-671-325-323     Sequence 323, App
    6  796.6  73.1   1316  4  US-09-589-184-323     Sequence 323, App
    7  796.6  73.1   1316  4  US-09-658-824-323     Sequence 323, App
    8  389.2  35.7    914  4  US-09-949-016-3127    Sequence 3127, Ap
    9  358.4  32.9  17740  4  US-09-949-016-13086   Sequence 13086, A
   10  328.6  30.1    601  4  US-09-949-016-46452   Sequence 46452, A
   11    316  29.0    585  3  US-09-123-912-92      Sequence 92, Appl
   12    316  29.0    585  3  US-09-643-597-92      Sequence 92, Appl
   13    316  29.0    585  4  US-09-480-884A-92     Sequence 92, Appl
   14    316  29.0    585  4  US-09-542-615A-92     Sequence 92, Appl
   15    316  29.0    585  4  US-09-606-421B-92     Sequence 92, Appl
   16    316  29.0    585  4  US-09-221-107-92      Sequence 92, Appl
   17    316  29.0    585  4  US-09-466-396A-92     Sequence 92, Appl
   18    316  29.0    585  4  US-09-476-496A-92     Sequence 92, Appl
   19    316  29.0    585  4  US-09-630-940B-92     Sequence 92, Appl
   20    316  29.0    585  4  US-09-285-479-92      Sequence 92, Appl
c  21    316  29.0    858  3  US-09-123-912-91      Sequence 91, Appl
c  22    316  29.0    858  3  US-09-643-597-91      Sequence 91, Appl
c  23    316  29.0    858  4  US-09-480-884A-91     Sequence 91, Appl
c  24    316  29.0    858  4  US-09-542-615A-91     Sequence 91, Appl
c  25    316  29.0    858  4  US-09-606-421B-91     Sequence 91, Appl
c  26    316  29.0    858  4  US-09-221-107-91      Sequence 91, Appl
c  27    316  29.0    858  4  US-09-466-396A-91     Sequence 91, Appl
c  28    316  29.0    858  4  US-09-476-496A-91     Sequence 91, Appl
c  29    316  29.0    858  4  US-09-630-940B-91     Sequence 91, Appl
c  30    316  29.0    858  4  US-09-285-479-91      Sequence 91, Appl
   31  304.8  28.0   1335  4  US-09-023-655-1010    Sequence 1010, Ap
   32    292  26.8   1337  3  US-08-801-344-3       Sequence 3, Appli
   33    292  26.8   1337  3  US-09-498-599-3       Sequence 3, Appli
   34  288.4  26.5    601  4  US-09-949-016-46451   Sequence 46451, A
   35  170.2  15.6   1290  4  US-09-270-767-13724   Sequence 13724, A
   36  169.8  15.6  15141  4  US-09-949-016-14869   Sequence 14869, A
   37  165.4  15.2    601  4  US-09-949-016-46442   Sequence 46442, A
   38  165.4  15.2    601  4  US-09-949-016-113475  Sequence 113475,
   39  145.4  13.3    233  4  US-09-702-705-31      Sequence 31, Appl
   40  145.4  13.3    233  4  US-09-736-457-31      Sequence 31, Appl
   41  145.4  13.3    233  4  US-09-614-124B-31     Sequence 31, Appl
   42  145.4  13.3    233  4  US-09-671-325-31      Sequence 31, Appl
   43  145.4  13.3    233  4  US-09-589-184-31      Sequence 31, Appl
   44  145.4  13.3    233  4  US-09-658-824-31      Sequence 31, Appl
   45  141.6  13.0    292  4  US-09-313-294A-6562   Sequence 6562, Ap



ALIGNMENTS 



RESULT 1 

US-09-949-016-1344 

; Sequence 1344, Application US/09949016 

; Patent No. 6812339 

; GENERAL INFORMATION: 

; APPLICANT: VENTER, J. Craig et al.

; TITLE OF INVENTION: POLYMORPHISMS IN KNOWN GENES ASSOCIATED 

; TITLE OF INVENTION: WITH HUMAN DISEASE, METHODS OF DETECTION AND USES THEREOF 

; FILE REFERENCE: CL001307 

; CURRENT APPLICATION NUMBER: US/09/949,016

; CURRENT FILING DATE: 2000-04-14 

; PRIOR APPLICATION NUMBER: 60/241,755 

; PRIOR FILING DATE: 2000-10-20 

; PRIOR APPLICATION NUMBER: 60/237,768 

; PRIOR FILING DATE: 2000-10-03 

; PRIOR APPLICATION NUMBER: 60/231,498 

; PRIOR FILING DATE: 2000-09-08 

; NUMBER OF SEQ ID NOS: 207012 

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 1344 

; LENGTH: 1515 

; TYPE: DNA 

; ORGANISM: Human

US-09-949-016-1344 

Query Match 76.0%; Score 828.2; DB 4; Length 1515; 

Best Local Similarity 81.3%; Pred. No. 1.2e-247; 

Matches 1087; Conservative 0; Mismatches 3; Indels 247; Gaps 1; 

Qy     1 CAAAAACAGCAACAGAAAGCAGGACGTGAGACTTCTACCTGCTCACTCAGAATCATTTCT 60
         |||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||
Db   179 CAAAAACAGCAACAGAGAGCAGGACGTGAGACTTCTACCTGCTCACTCAGAATCATTTCT 238

Qy    61 GCACCAACCATGGCCACGTTTGTGGAGCTCAGTACCAAAGCCAAGATGCCCATTGTGGGC 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   239 GCACCAACCATGGCCACGTTTGTGGAGCTCAGTACCAAAGCCAAGATGCCCATTGTGGGC 298

Qy   121 CTGGGCACTTGGAAGTCTCCTCTCGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 180
         ||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||
Db   299 CTGGGCACTTGGAAGTCTCCTCTTGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 358

Qy   181 GCAGGATATCGGCACATTGACTGTGCCTATGTCTATCAGAATGAACATGAAGTGGGGGAA 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   359 GCAGGATATCGGCACATTGACTGTGCCTATGTCTATCAGAATGAACATGAAGTGGGGGAA 418

Qy   241 GCCATCCAAGAGAAGATCCAAGAGAAGGCTGTGAAGCGGGAGGACCTGTTCATCGTCAGC 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   419 GCCATCCAAGAGAAGATCCAAGAGAAGGCTGTGAAGCGGGAGGACCTGTTCATCGTCAGC 478

Qy   301 AAGTTGTGGC-------------------------------------------------- 310
         ||||||||||
Db   479 AAGTTGTGGC[...] 538

Qy   311 ------------------------------------------------------------ 310

Db   539 AAGGACCTGAAGCTGAGCTATCTGGACGTCTATCTTATTCACTGGCCACAGGGATTCAAG 598

Qy   311 ------------------------------------------------------------ 310

Db   599 TCTGGGGATGACCTTTTCCCCAAAGATGATAAAGGTAATGCCATCGGTGGAAAAGCAACG 658

Qy   311 ------------------------------------------------------------ 310

Db   659 TTCTTGGATGCCTGGGAGGCCATGGAGGAGCTGGTGGATGAGGGGCTGGTGAAAGCCCTT 718

Qy   311 -----------------CCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAA 353
                          |||||||||||||||||||||||||||||||||||||||||||
Db   719 GGGGTCTCCAATTTCAGCCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAA 778

Qy   354 TATAAACCAGTGACTAACCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATC 413
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   779 TATAAACCAGTGACTAACCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATC 838

Qy   414 CAGTACTGCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGAT 473
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   839 CAGTACTGCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGAT 898

Qy   474 AGACCTTGGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATT 533
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   899 AGACCTTGGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATT 958

Qy   534 GCTGCAAAGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAAT 593
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   959 GCTGCAAAGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAAT 1018

Qy   594 GTGATTGTCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTT 653
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1019 GTGATTGTCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTT 1078

Qy   654 GACTTTAAATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGG 713
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1079 GACTTTAAATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGG 1138
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us-10-653-681b-l.rni 
Qy   714 GCCTGTAACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGA 773
         ||||||||||||||||||||||||||||||||||||||||||||| ||||||||||||||
Db  1139 GCCTGTAACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCAATGCAGAATATTGA 1198

Qy   774 GGTTGAATCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTAC 833
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1199 GGTTGAATCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTAC 1258

Qy   834 CTCCACTCATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTT 893
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1259 CTCCACTCATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTT 1318

Qy   894 ATAGACGAGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAG 953
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1319 ATAGACGAGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAG 1378

Qy   954 AATATCACAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATC 1013
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1379 AATATCACAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATC 1438

Qy  1014 AGAACAAATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATA 1073
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1439 AGAACAAATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATA 1498

Qy  1074 AAAAAAATAATAATCAT 1090
         |||||||||||||||||
Db  1499 AAAAAAATAATAATCAT 1515
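
The percentages reported for RESULT 1 of this search (Query Match 76.0%, Best Local Similarity 81.3%) can be reproduced from the counts printed in the same result header. The sketch below is not part of the report. It assumes that Query Match is the score divided by the perfect score, and that Best Local Similarity is the number of matches divided by all aligned columns (matches, mismatches and indels); both formulas are inferred from the reported numbers rather than taken from GenCore documentation, and the function names are illustrative.

```python
def query_match_pct(score, perfect_score):
    """Score as a percentage of the best achievable score (assumed formula)."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, mismatches, indels):
    """Matches as a percentage of all alignment columns (assumed formula)."""
    aligned_columns = matches + mismatches + indels
    return 100.0 * matches / aligned_columns


# Counts from the RESULT 1 header: Score 828.2 against a perfect score of 1090,
# with 1087 matches, 3 mismatches and 247 indel positions.
qm = query_match_pct(828.2, 1090)
bls = best_local_similarity_pct(1087, 3, 247)
print(f"Query Match {qm:.1f}%; Best Local Similarity {bls:.1f}%")
```

The same two formulas also give 100.0% for the exact self-match results (score 1090, 1090 matches, no mismatches or indels) reported elsewhere in this document.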






us-10-653-681b-l.rnpb 



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         August 9, 2005, 06:00:04 ; Search time 1792 Seconds
                (without alignments)
                3942.932 Million cell updates/sec

Title:          US-10-653-681B-1
Perfect score:  1090
Sequence:       1 caaaaacagcaacagaaagc ataaaaaaaataataatcat 1090

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       7297361 seqs, 3241162794 residues

Total number of hits satisfying chosen parameters:  14594722

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      Published_Applications_NA:*



                1: /cgn2_6/ptodata/2/pubpna/US07_PUBCOMB.seq:*
                2: /cgn2_6/ptodata/2/pubpna/PCT_NEW_PUB.seq:*
                3: /cgn2_6/ptodata/2/pubpna/US06_NEW_PUB.seq:*
                4: /cgn2_6/ptodata/2/pubpna/US06_PUBCOMB.seq:*
                5: /cgn2_6/ptodata/2/pubpna/US07_NEW_PUB.seq:*
                6: /cgn2_6/ptodata/2/pubpna/PCTUS_PUBCOMB.seq:*
                7: /cgn2_6/ptodata/2/pubpna/US08_NEW_PUB.seq:*
                8: /cgn2_6/ptodata/2/pubpna/US08_PUBCOMB.seq:*
                9: /cgn2_6/ptodata/2/pubpna/US09A_PUBCOMB.seq:*
                10: /cgn2_6/ptodata/2/pubpna/US09B_PUBCOMB.seq:*
                11: /cgn2_6/ptodata/2/pubpna/US09C_PUBCOMB.seq:*
                12: /cgn2_6/ptodata/2/pubpna/US09_NEW_PUB.seq:*
                13: /cgn2_6/ptodata/2/pubpna/US10A_PUBCOMB.seq:*
                14: /cgn2_6/ptodata/2/pubpna/US10B_PUBCOMB.seq:*
                15: /cgn2_6/ptodata/2/pubpna/US10C_PUBCOMB.seq:*
                16: /cgn2_6/ptodata/2/pubpna/US10D_PUBCOMB.seq:*
                17: /cgn2_6/ptodata/2/pubpna/US10E_PUBCOMB.seq:*
                18: /cgn2_6/ptodata/2/pubpna/US10F_PUBCOMB.seq:*
                19: /cgn2_6/ptodata/2/pubpna/US10G_PUBCOMB.seq:*
                20: /cgn2_6/ptodata/2/pubpna/US10H_PUBCOMB.seq:*
                21: /cgn2_6/ptodata/2/pubpna/US10I_PUBCOMB.seq:*
                22: /cgn2_6/ptodata/2/pubpna/US10_NEW_PUB.seq:*
                23: /cgn2_6/ptodata/2/pubpna/US11A_PUBCOMB.seq:*
                24: /cgn2_6/ptodata/2/pubpna/US11_NEW_PUB.seq:*
                25: /cgn2_6/ptodata/2/pubpna/US60_NEW_PUB.seq:*
                26: /cgn2_6/ptodata/2/pubpna/US60_PUBCOMB.seq:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
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Sequence 
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Sequence 
Sequence 
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323, App 
323, App 
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1, Appli 
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91, Appl 
97, Appl 
1010, Ap 
332, App 
! 1, Appli 
641, App 
641, App 
289, App 
21, Appl 
128, App 



ALIGNMENTS 



RESULT 1 

US-10-653-681A-1 

; Sequence 1, Application US/10653681A 

; Publication No. US20050048503A1 

; GENERAL INFORMATION: 

; APPLICANT: DAI, KEN-SHWO 

; TITLE OF INVENTION: HUMAN ARL-RELATED GENE VARIANTS ASSOCIATED WITH CANCER 
; FILE REFERENCE: U014798-3 

; CURRENT APPLICATION NUMBER: US/10/653,681A
; CURRENT FILING DATE: 2003-09-02 
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us-10-653-681b-l.rnpb 

; NUMBER OF SEQ ID NOS: 4
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 1
; LENGTH: 1090
; TYPE: DNA
; ORGANISM: ARTIFICIAL SEQUENCE
; FEATURE:
; OTHER INFORMATION: VARIANT OF HUMAN ALDOSE REDUCTASE-LIKE GENE
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (70)..(333)
US-10-653-681A-1

Query Match 100.0%; Score 1090; DB 21; Length 1090;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1090; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy     1 CAAAAACAGCAACAGAAAGCAGGACGTGAGACTTCTACCTGCTCACTCAGAATCATTTCT 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 CAAAAACAGCAACAGAAAGCAGGACGTGAGACTTCTACCTGCTCACTCAGAATCATTTCT 60

Qy    61 GCACCAACCATGGCCACGTTTGTGGAGCTCAGTACCAAAGCCAAGATGCCCATTGTGGGC 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 GCACCAACCATGGCCACGTTTGTGGAGCTCAGTACCAAAGCCAAGATGCCCATTGTGGGC 120

Qy   121 CTGGGCACTTGGAAGTCTCCTCTCGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 CTGGGCACTTGGAAGTCTCCTCTCGGCAAAGTGAAAGAAGCAGTGAAGGTGGCCATTGAT 180

Qy   181 GCAGGATATCGGCACATTGACTGTGCCTATGTCTATCAGAATGAACATGAAGTGGGGGAA 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 GCAGGATATCGGCACATTGACTGTGCCTATGTCTATCAGAATGAACATGAAGTGGGGGAA 240

Qy   241 GCCATCCAAGAGAAGATCCAAGAGAAGGCTGTGAAGCGGGAGGACCTGTTCATCGTCAGC 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 GCCATCCAAGAGAAGATCCAAGAGAAGGCTGTGAAGCGGGAGGACCTGTTCATCGTCAGC 300

Qy   301 AAGTTGTGGCCCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAAC 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 AAGTTGTGGCCCACTTCCAGATCGAGAAGCTCTTGAACAAACCTGGACTGAAATATAAAC 360

Qy   361 CAGTGACTAACCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACT 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   361 CAGTGACTAACCAGGTTGAGTGTCACCCATACCTCACGCAGGAGAAACTGATCCAGTACT 420

Qy   421 GCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTT 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   421 GCCACTCCAAGGGCATCACCGTTACGGCCTACAGCCCCCTGGGCTCTCCGGATAGACCTT 480

Qy   481 GGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAA 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   481 GGGCCAAGCCAGAAGACCCTTCCCTGCTGGAGGATCCCAAGATTAAGGAGATTGCTGCAA 540

Qy   541 AGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTG 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   541 AGCACAAAAAAACCGCAGCCCAGGTTCTGATCCGTTTCCATATCCAGAGGAATGTGATTG 600

Qy   601 TCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTA 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   601 TCATCCCCAAGTCTGTGACACCAGCACGCATTGTTGAGAACATTCAGGTCTTTGACTTTA 660

Qy   661 AATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTA 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   661 AATTGAGTGATGAGGAGATGGCAACCATACTCAGCTTCAACAGAAACTGGAGGGCCTGTA 720

Qy   721 ACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAA 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   721 ACGTGTTGCAATCCTCTCATTTGGAAGACTATCCCTTCGATGCAGAATATTGAGGTTGAA 780

Qy   781 TCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACT 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   781 TCTCCTGGTGAGATTATACAGGAGATTCTCTTTCTTCGCTGAAGTGTGACTACCTCCACT 840

Qy   841 CATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACG 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   841 CATGTCCCATTTTAGCCAAGCTTATTTAAGATCACAGTGAACTTAGTCCTGTTATAGACG 900

Qy   901 AGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCA 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   901 AGAATCGAGGTGCTGTTTTAGACATTTATTTCTGTATGTTCAACTAGGATCAGAATATCA 960

Qy   961 CAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAA 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   961 CAGAAAAGCATGGCTTGAATAAGGAAATGACAATTTTTTCCACTTATCTGATCAGAACAA 1020

Qy  1021 ATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAA 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1021 ATGTTTATTAAGCATCAGAAACTCTGCCAACACTGAGGATGTAAAGATCAATAAAAAAAA 1080

Qy  1081 TAATAATCAT 1090
         ||||||||||
Db  1081 TAATAATCAT 1090
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